Gluing the web together with a personal webhook server

I mentioned earlier that I'm using a personal webhook server as glue between various online tools and services, and this post expands on that.

Why It Matters

For the open web to work well enough to bring people back to it, information has to be able to come and go freely between different systems without relying on centrally controlled hubs.

Just like RSS feeds, REST APIs and other tools that allow software programs to talk to each other across disparate systems, webhooks are an important part of that mix. Whereas many of those tools are "pull" technologies - one system goes out and makes a request of another system to see what's been added or changed - webhooks are a "push" technology, where one system proactively notifies another about a change or event that's occurred.
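To make the "push" idea concrete, here is a minimal sketch of a webhook receiver and a sender talking to it, using only the Python standard library. The /hooks/toaster path and the toast_done event are made up for illustration; this is not how any particular service implements its webhooks.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

received = []  # events pushed to us, in arrival order


class HookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # The sender decides when to call us; we just accept the event.
        length = int(self.headers.get("Content-Length", 0))
        received.append(json.loads(self.rfile.read(length) or b"{}"))
        self.send_response(200)
        self.send_header("Content-Length", "2")
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):
        pass  # keep the demo quiet


def demo():
    # Port 0 asks the OS for any free port.
    server = HTTPServer(("127.0.0.1", 0), HookHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()

    # The "other system" pushes an event as soon as it happens.
    body = json.dumps({"event": "toast_done"}).encode()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
    conn.request("POST", "/hooks/toaster", body=body,
                 headers={"Content-Type": "application/json"})
    status = conn.getresponse().status
    conn.close()

    server.shutdown()
    server.server_close()
    return status, list(received)
```

Calling demo() starts the receiver, pushes one event at it, and returns the HTTP status along with whatever the receiver collected. A "pull" setup would instead have the receiver asking the toaster on a schedule whether anything had changed.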

You might be using webhooks right now! Popular services like IFTTT and Zapier are built in part on webhooks to make certain things happen over there when something else has happened over here. "Flash my lights when my toaster is done!" And while I have great respect for what IFTTT and Zapier do and how they do it, I get a little uncomfortable when I see (A) so much of the automated, programmable web flowing through a small number of commercially oriented services, and (B) software and service owners building private API and webhook access for IFTTT and Zapier that they don't make available to the rest of us.

So I decided that just as I host RSS feeds and APIs for many of the things I build and share online (mostly through WordPress, since it makes that so easy), I also wanted to have the ability to host my own webhook endpoints.

How I Set It Up

I used the free and open source webhook server software. (It's unfortunate that they chose the same name as the accepted term for the underlying technical concept, so just know that this isn't the only way to run a webhook server.) I didn't have a Go environment set up on the Ubuntu server I wanted to use, so I installed the precompiled version from their release directory.

I created a very simple webhook endpoint in my initial hooks.json configuration file that would tell me the server was working at all:

[
  {
    "id": "webhook-monitor",
    "execute-command": "/bin/true",
    "command-working-directory": "/var/tmp",
    "response-message": "webhook is running"
  }
]

Then I started up the server to test it out:

/usr/local/bin/webhook -hooks /path/to/hooks.json -verbose -ip 127.0.0.1

This tells the webhook server to listen on the server's localhost network interface, on port 9000 (the default, since I didn't specify one). When I request that endpoint I should see a successful result:

$ curl -I http://localhost:9000/hooks/webhook-monitor
HTTP/1.1 200 OK
...

From there I set up an nginx configuration to make the server available on the Internet. Here are some snippets from that config:

upstream webhook_server {
    server 127.0.0.1:9000;
    keepalive 300;
}

limit_req_zone $request_uri zone=webhooklimit:10m rate=1r/s;

server {
    ...

    location /hooks/ {
        proxy_pass http://webhook_server;
        ...

        limit_req zone=webhooklimit burst=20;
    }

    ...
}

This establishes a proxy setup where nginx passes requests for my public-facing server on to the internally running webhook server. It also limits each endpoint URL to one request per second, with room for a burst of up to 20 extra requests, so that some poorly configured webhook client can't hammer my webhook server too hard.
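For a feel of how that kind of rate limit behaves, here is a simplified model of the accept/reject decision for a single endpoint URL. It's a sketch under my own assumptions, not nginx's actual code: the LeakyBucket class is hypothetical, and it ignores the fact that nginx delays the queued burst requests to pace them out rather than serving them all at once.

```python
class LeakyBucket:
    """Simplified accept/reject model of a rate limit with a burst allowance."""

    def __init__(self, rate, burst):
        self.rate = rate      # requests drained per second
        self.burst = burst    # how many excess requests may queue up
        self.excess = 0.0     # current backlog, in requests
        self.last = None      # time of the last accepted request

    def allow(self, now):
        if self.last is None:
            # First request ever seen for this key.
            self.last = now
            return True
        elapsed = now - self.last
        # The backlog drains over time, then this request adds one to it.
        excess = max(self.excess - self.rate * elapsed + 1.0, 0.0)
        if excess > self.burst:
            return False      # too far over the rate: reject
        self.excess = excess
        self.last = now
        return True
```

With rate=1.0 and burst=20, a client that fires 25 requests at the same instant gets 21 of them accepted (the first one plus 20 of burst) and the rest rejected, and it is let back in once the backlog has had a few seconds to drain.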

I generated a Let's Encrypt SSL certificate for my webhook server so that all of my webhook traffic is encrypted in transit. Then I added the webhook server to my server's boot time startup by creating an Upstart job at /etc/init/webhook.conf:

description "Webhook Server"
author "Chris Hardie"

start on runlevel [2345]
stop on runlevel [!2345]

setuid myuser

console log

normal exit 0 TERM

kill signal KILL
kill timeout 5

exec /usr/local/bin/webhook -hooks /path/to/hooks.json -verbose -ip 127.0.0.1 -hotreload

The hotreload parameter tells the webhook server to monitor for and load any changes to the hooks.json file right away, instead of waiting for you to restart/reload the server. Just make sure you're confident in your config file syntax. 🙂

After that, service webhook start will get things up and running.

To add new webhook endpoints, you just add on to the JSON configuration file. There are some good starter examples in the software documentation.

I strongly recommend requiring a query string parameter that acts as a "secret key" before allowing any service to call one of your webhook endpoints. I also recommend configuring nginx to pass along the IP address of each webhook client, and then matching it against a whitelist of allowed callers in your hook config. These steps help keep your webhook server secure.
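The logic behind those two checks is simple enough to sketch. This is not the webhook software's own rule syntax (see its documentation for that), just an illustration of the idea. The token parameter name, the secret value and the allowed network are all made up.

```python
import hmac
import ipaddress
from urllib.parse import parse_qs, urlsplit

SECRET = "change-me"  # hypothetical shared secret
ALLOWED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]  # hypothetical callers


def is_authorized(url, client_ip, secret=SECRET, networks=ALLOWED_NETWORKS):
    """Return True only if the secret key matches AND the caller is whitelisted."""
    query = parse_qs(urlsplit(url).query)
    supplied = query.get("token", [""])[0]
    # compare_digest avoids leaking how much of the key matched via timing.
    key_ok = hmac.compare_digest(supplied.encode(), secret.encode())
    addr = ipaddress.ip_address(client_ip)
    ip_ok = any(addr in net for net in networks)
    return key_ok and ip_ok
```

Requiring both checks means a leaked secret alone isn't enough to trigger a hook, and neither is a request that merely comes from an allowed address. The client IP here needs to be the real caller's address as passed along by nginx, not nginx's own.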

Finally, I suggest setting up some kind of external monitoring for your webhook server, especially if you start to depend on it for important actions or notifications. I use an UptimeRobot check that confirms my test endpoint above returns the expected "webhook is running" string, and alerts me if it doesn't.
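If you'd rather roll your own check, the idea is roughly the following. This is a sketch of the concept, not how any monitoring service works internally; the function names are mine, and you'd still need to run it on a schedule from somewhere outside the server being monitored.

```python
import urllib.request

EXPECTED = "webhook is running"  # the response-message from the test hook


def is_healthy(status, body, expected=EXPECTED):
    """Healthy means a 200 response whose body contains the expected string."""
    return status == 200 and expected in body


def check(url, timeout=10):
    """Fetch the monitoring endpoint; any network or HTTP error counts as down."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read().decode("utf-8", "replace")
            return is_healthy(resp.status, body)
    except OSError:  # covers URLError, HTTPError and timeouts
        return False
```

Checking for the string, and not just a 200 status, catches the case where nginx is up and answering but the webhook server behind it isn't the thing responding.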

If you don't want to go to the trouble of setting up and hosting your own webhook server, you might look at Hookdoo, which hosts the webhook endpoints for you and then still allows you to script custom actions that result when a webhook is called. I haven't used it myself but it's built by the same folks who released the above webhook server software.

How I Use It

So what webhook endpoints am I using?

My favorite and most frequently used is a webhook that Bitbucket calls when I merge changes to the master branch of one of the code repositories powering the various websites and utilities I maintain. When the webhook is called, it executes a local script that essentially runs a git pull on that repo into the right place, and then pings a private Slack channel to let me know the deployment was successful. This is much faster than manually logging into my webserver to pull down changes or doing silly things like SFTPing files around. It covers a lot of the functionality of paid services like DeployBot or DeployHQ, though obviously those tools offer a lot more bells and whistles for the average use case.
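A deploy script along those lines might look something like this. It's a minimal sketch and not my actual script: the function names and message wording are invented, and it assumes a Slack incoming webhook URL that accepts a JSON payload with a text field.

```python
import json
import subprocess
import urllib.request


def deploy(repo_dir, run=subprocess.run):
    """Fast-forward the checkout in repo_dir; return True if git succeeded."""
    result = run(["git", "-C", repo_dir, "pull", "--ff-only"],
                 capture_output=True, text=True)
    return result.returncode == 0


def slack_payload(repo, ok):
    """Build the JSON body for a Slack incoming webhook."""
    status = "succeeded" if ok else "FAILED"
    return json.dumps({"text": "Deploy of %s %s" % (repo, status)}).encode()


def notify(slack_url, repo, ok):
    """POST the deploy result to a Slack incoming webhook URL (hypothetical)."""
    req = urllib.request.Request(
        slack_url,
        data=slack_payload(repo, ok),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req, timeout=10).close()
```

The hook's execute-command would point at a small wrapper that calls deploy() and then notify() with the result. Using --ff-only keeps the script from creating surprise merge commits on the server if the checkout has drifted.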

I use a version of this webhook app for SmartThings to send event information from my various connected devices and monitors at home to a database where I can run SQL queries on them. It's fun to ask your house what it's been up to in this way.
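The storage side of this can be very small. Here's a rough sketch using SQLite; the table layout and the device, attribute and value field names are assumptions for illustration, not the actual payload format SmartThings sends.

```python
import json
import sqlite3


def open_db(path=":memory:"):
    """Open (or create) the event store. The schema is hypothetical."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS events ("
        " received_at TEXT DEFAULT CURRENT_TIMESTAMP,"
        " device TEXT, attribute TEXT, value TEXT)"
    )
    return db


def store_event(db, payload):
    """Parse one JSON webhook payload and save it as a row."""
    event = json.loads(payload)
    db.execute(
        "INSERT INTO events (device, attribute, value) VALUES (?, ?, ?)",
        (event.get("device"), event.get("attribute"), str(event.get("value"))),
    )
    db.commit()


def count_by_device(db):
    """Ask the house what it's been up to: number of events per device."""
    return db.execute(
        "SELECT device, COUNT(*) FROM events GROUP BY device ORDER BY device"
    ).fetchall()
```

The webhook's command would call store_event() with the request body, and from then on any SQL query against the events table is fair game.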

The connected vehicle monitoring adapter I use has a webhook option that sends events about my driving activity and car trips to my webhook server. I also store this in a database for later querying and reporting. I have plans to extend this to create or modify items on my to-do list based on locations I've visited; e.g. if I go to the post office to pick up the mail, check off my reminder to go to the post office.

I've asked the Fastmail folks to add in support for a generic webhook endpoint as a notification target in their custom mail processing rules; right now they only support IFTTT. There are lots of neat possibilities for "when I receive an email like this, do this other thing with my to-do list, calendar, smart home or website."

Speaking of IFTTT, they do have a pretty neat recipe component that will let you call webhooks based on other triggers they support. It's a very thoughtful addition to their lineup.

Conclusion

I'm not expecting many folks to go out and set up a webhook server; there's probably a pretty small subset of the population that will find this useful, let alone act on it. But I'm glad it's an option for someone to glue things together like this if they want to, and we need to make sure that software makers and service providers continue to incorporate support for webhooks and other technologies of an open, connected web as they go.

A year without Facebook

It's been about a year since I left Facebook, and I'm still glad I did. (I guess there were those thirty years before Facebook existed that I somehow managed without it, too.)

Some observations:

People in my circles generally continue to assume that I've seen their event invitations and life updates on Facebook, and so it's still a regular occurrence that I find out about something well after everyone else, or not at all. This is most annoying when it's something really time sensitive that I would have liked to have been a part of.

Some of my published writings have been shared extensively on Facebook, generating hundreds or even thousands of views on my various websites, but I don't have a way of knowing where that activity is coming from or what kind of conversation it might be generating there. I've had people tell me in person that they saw and liked something via Facebook, which is nice, but of course I wish they'd leave their likes and comments on my site where it's closer to the original writing, visible to the world, and not subject to later deletion by some corporate entity. (This comes up for any social network, not just Facebook, but it tends to be the one generating the most traffic for me.)

I won't make a claim that the hours I've saved by not looking at Facebook have freed me up to accomplish some amazing other thing. I will say that I felt a nice release from the self-created pressure to keep up with my interactions and profile there, and that in turn has contributed to an increase in my overall creative energy for other things.

There was one time when I needed to use the Facebook sharing debugger for a work project. I signed up for a new account to do this, but Facebook clearly found my lack of interest in populating a real-looking profile to be suspicious, and closed down the account soon after. In the end it was faster to ask a colleague with an active account to do the debugging for me and share the results. As I've said before, I think it's ridiculous and irresponsible that Facebook doesn't make that tool available to logged-out users.

I'm still surprised at how many organizations and businesses use Facebook as their one and only place for posting content; some even do it in a way that I just can't see as a logged-out user, and others don't seem to realize that they're giving Facebook 80% of the screen real estate on the links I can see. I am now much more likely to avoid doing business with or offering my support to these entities if they don't bother offering non-Facebook ways for me to engage.

I've accepted that people will not necessarily seek out the open version of the web on their own. Being off Facebook has reinforced that there are big gaps to close in the user experiences that other tools and services offer (the WordPress/blogging ecosystem not least among them). My own efforts to migrate my content that still exists on other services like Flickr into a digital home that I fully control are slow-going, so I don't expect other people to even bother. Facebook is still the path of least resistance for now.

When the actions of Cambridge Analytica were in the news, it was tempting to feel smug about not being an active Facebook user. But I know they still have tons of information about me that is of value to advertisers and others, and that even as I use browser plugins to try to prevent Facebook from accumulating an even larger profile of my online activity, it is a losing battle until there are larger shifts in the culture and business models of technology companies.

Scoring sites on their commitment to the open web?

A month ago in a tweet related to my post about bringing people back to the open web, I casually proposed a resource that would score tools, services and other websites on their commitment to being a part of the open web. I'm back to flesh that idea out a little more.

Crude mockup of a score badge

I'm imagining a simple site that displays a score or grade for each major user-facing tool or service on the web.

The score would help users of the site know at a glance what to expect from the service when it comes to the practices and mechanics of maintaining openness on the web. A badge with the score on it could be voluntarily displayed by the sites themselves, or the score could be incorporated into a browser extension and similar tools that give visibility to the information as users explore the web.

If a site has a high score, users could confidently invest time and energy in it knowing that they'd benefit from clear ownership of their data, easy interoperability with other tools, and no proprietary lock-in. If a site has a low score, users would know that they are entering a walled garden where their data and access to it is the product.

The score or grade would be based on some easily digestible criteria. In my initial proposal these would look at the robustness of the site's API offering, the availability of standard feed options, the usefulness of export tools, the focus on user empowerment, and the level of transparency about how the service works and makes use of user data.
