One long, intimate month with the Facebook API

As I announced back in December, my goal for 2011 is to launch a full-fledged software startup. But first – I decided to spend a few weeks creating a Facebook app. Hey, why not? 🙂 As many of you know, I run a Twitter over email service that does really well within its particular niche. I’ve felt for a long time that a similar service would make a lot of sense for Facebook and finally decided to invest the time in creating it. Developing for the Facebook platform definitely has its challenges. I thought I’d take a few moments to reflect on the experience and share some of the technical details of working with the Facebook API.

You might have seen news coverage about the Breakup Notifier app that just came out. I was planning to release a similar feature more than 2 weeks ago. We jokingly called it the Stalker Alert. However, during a private beta test with several users, I discovered I wasn’t able to provide an accurate notification due to unreliable data returned by the Facebook API. Kudos to the developer if he found a way to make it happen but I’m quite doubtful that his app always reports accurate information. I explain why below.

Background: “Project Zuckermail”

The purpose of this post is to talk about the Facebook API but I do want to give some background on my app. If you really just want to read about the API, feel free to skip ahead.

The basic idea behind my app was to create a fully functional Facebook client that would work entirely over email. You would be able to update your status, post on friends’ walls, get your news feed, “like”, comment, and more just by sending email. I also planned to include a few interesting email alerts in my initial launch. I knew I could create better versions of Facebook’s standard email notifications for “likes” and comments, but I decided to hold off on those so that I could focus on my own unique email alerts first. I codenamed the app “Project Zuckermail”.

The two email alerts I settled on were a Birthday Alert (obvious choice, but I have a unique implementation) and what I called the “Life Events Alert”. Despite the boring name, the Life Events Alert was the flashy feature that I thought would really help my app get noticed. The idea was that you could create the alert and be notified whenever certain important events occurred in your friends’ lives. For example: relocations, job changes, and of course – those crazy breakups.

I originally estimated the project would take less than a month to develop based on the fact that I could re-use much of the knowledge, server configuration, and code assets from my existing Twitter app. Last week, after five weeks of development and roughly 10,000 lines of code, I finally released it. So what took so long? The Life Events Alert – and specifically, the Facebook API – really threw a wrench in my plans.

I spent nearly two weeks on the Life Events Alert feature alone, and because of data integrity and consistency problems, I ultimately killed it. I lost a lot of time and I lost my killer feature. I also had to scramble to come up with some other features to make my app “interesting enough” to get noticed. After looking around for another good gap to fill, I decided to add really nice email notifications for Facebook pages. Unfortunately, Facebook launched their own page notifications (albeit less useful) the day I finished my version of the feature. Hence, to further compensate, I went ahead and added the better email notifications for “likes” and comments that I had originally planned to put off. Sigh.

Fortunately, the end result is a fantastic app that I am really proud of. I haven’t yet received the type of blog coverage I was hoping for — you know, the kind Breakup Notifier received — but I’m getting fantastic feedback from new users. Several folks have told me that my app has helped them get more active on Facebook. In particular, I’ve received a lot of great comments from blind and visually impaired users who find my app to be a more effective way to access Facebook. Ultimately, that type of feedback is what makes building an app like this so rewarding.

In case you are wondering, I’m intentionally omitting the name of my app because I do not want this blog post to mix in with search results for the app. However, it’s ready to go and you can sign up for the free beta here. And no – I didn’t go with “Zuckermail” as the actual name. Trust me, I really wanted to. I’m sure the folks reviewing app submissions at Facebook have a sense of humor but I’m not willing to find out the hard way. 😉

The Facebook API

Alright, so let’s talk about the Facebook API. If you’ve ever looked at the developer docs then you know how easy it is to get lost. Facebook has several APIs for both server and client-side access. I primarily focused on the Graph API and FQL in my development.

Graph API

I really like the concept of the Graph API. In particular, the notion that every object in the Facebook universe (e.g. person, wall post, photo) is a node in a graph with its own connections seems to make a lot of sense. The Graph API also feels like a true REST API – certainly more so than the API that Facebook actually refers to as their “REST API” (which is really RPC over HTTP). For example, make a request for https://graph.facebook.com/anilchawla and you get my profile. With an access_token and the right permissions, you can see more of my profile and list “connections” such as https://graph.facebook.com/anilchawla/friends and https://graph.facebook.com/anilchawla/photos.

I should point out that the URLs in the Graph API do not continue to follow a typical REST convention. For example, you might request https://graph.facebook.com/anilchawla/albums and see that I have an album with an ID of 12345. The URL for that album is https://graph.facebook.com/12345 and not https://graph.facebook.com/anilchawla/albums/12345. This might upset REST purists but I didn’t find this to be an issue in practice. It turns out that IDs in the graph are universal and this simplification is kind of nice. All in all, I’d say I’m a fan of the Graph API concept. However, as I explain below, there are some issues with the data you get back from the Graph API.
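
To make the URL scheme concrete, here is a minimal Python sketch that only builds the URLs described above and makes no network calls. The helper name graph_url is hypothetical, and "TOKEN" stands in for a real access_token.

```python
from urllib.parse import urlencode

GRAPH_ROOT = "https://graph.facebook.com"


def graph_url(object_id, connection=None, **params):
    """Build a Graph API URL for an object, or for one of its connections.

    object_id  -- a username or a numeric ID (IDs are universal in the graph)
    connection -- optional connection name such as "friends" or "albums"
    params     -- optional query parameters such as access_token
    """
    url = f"{GRAPH_ROOT}/{object_id}"
    if connection:
        url += f"/{connection}"
    if params:
        url += "?" + urlencode(params)
    return url


# A profile, and one of its connections:
print(graph_url("anilchawla"))            # https://graph.facebook.com/anilchawla
print(graph_url("anilchawla", "albums"))  # https://graph.facebook.com/anilchawla/albums

# An album is addressed by its own ID, not nested under the user:
print(graph_url("12345"))                 # https://graph.facebook.com/12345

# With a (placeholder) access token:
print(graph_url("anilchawla", "friends", access_token="TOKEN"))
```

Because every object is reachable at the root by its ID, one helper like this covers profiles, albums, photos, and anything else in the graph.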

FQL

FQL is Facebook Query Language and is a SQL-like language for accessing Facebook’s data model – or at least the data model they are willing to expose via the API. To be honest, I think the concept of FQL is awesome. It’s not quite as powerful as SQL (for example, no joins) but the ability to execute your own complex database-like queries is really cool. It’s also impressive how much data you can access with FQL. For example, in order to implement my Page Notifications feature, I’m using a set of FQL queries that check for comments on any wall post, photo, or album ever created on the page. I tested it by commenting on a Starbucks post from 2009. Keeping in mind how much data Facebook processes and stores each day, it’s pretty amazing that I can run a non-trivial query across data that is nearly 20 months old.
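
For a feel of what such a query looks like, here is a rough Python sketch that builds one FQL query and the URL to submit it. It is not the set of queries my app actually runs. The table and column names (comment, stream, post_id, fromid, source_id) and the fql.query endpoint are written from memory of the FQL reference and should be treated as assumptions to check against the docs. The function names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed endpoint for submitting FQL; verify against the current docs.
FQL_ENDPOINT = "https://api.facebook.com/method/fql.query"


def page_comments_query(page_id, since):
    """Return an FQL query for comments on a page's wall posts.

    FQL has no joins, so the wall posts are selected with a sub-query
    and matched using IN. `since` is a Unix timestamp.
    """
    page_id = int(page_id)  # coerce to int so nothing else ends up in the query
    since = int(since)
    return (
        "SELECT post_id, fromid, text, time FROM comment "
        "WHERE post_id IN "
        f"(SELECT post_id FROM stream WHERE source_id = {page_id}) "
        f"AND time > {since}"
    )


def fql_url(query, access_token):
    """URL-encode the query and attach it to the (assumed) endpoint."""
    params = {"query": query, "format": "json", "access_token": access_token}
    return FQL_ENDPOINT + "?" + urlencode(params)


query = page_comments_query(123, 1296518400)
print(query)
print(fql_url(query, "TOKEN"))
```

The sub-query is the part worth noticing: it is how you combine tables in FQL without joins.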

The downside of FQL is that you are limited by the data model that Facebook exposes. There are some cases in which I have to combine results from multiple tables in order to find a specific object. Even though the user should be able to access the target object (e.g. a photo), the result comes up empty because some portion of the query fails a privacy check (e.g. can’t select the album containing the photo). The other issue is that some of the FQL tables are just bad. Really bad. For example, this is the schema of the “notification” table:

You can imagine how much fun it is to re-implement Facebook’s email notifications when all you get is some pre-generated text and HTML, and none of the underlying metadata. Perhaps this is by design?

Data integrity and consistency

It’s great if your APIs are conceptually well-designed but what really matters is how well they work. This is where I experienced the most grief with Facebook’s APIs. Based on my experience, I do not believe that 3rd-party clients can rely on the data provided by the Facebook API. If the data you need is there – great. In most cases it will be there. But don’t depend on it. I learned this lesson the hard way after completely implementing the Life Events Alert I mentioned above. This is also why I don’t think an app like Breakup Notifier can be implemented in a reliable fashion. Here are the issues I encountered:

Fields disappear and reappear randomly
When requesting a person’s profile you can indicate which fields you want to receive. For example, you can request: location, relationship_status, education. Obviously, some people might not have this information in their profiles or they might have privacy settings that prevent an app from accessing the information. Assuming the field contains a value and is accessible, though, you would think that you would always see the information in the profile, right? Wrong. It turns out that the Facebook API sometimes drops entire fields from the response even though they should be there. I consistently saw this problem with the location attribute and also saw it happen with other attributes such as relationship_status and even gender. My private beta testers kept getting notifications that their friends had moved. Oops.
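
One defensive way to compare two profile snapshots is to treat a missing field as "unknown" rather than "changed". The Python sketch below illustrates that idea under simplifying assumptions: field values are shown as plain strings, and the function names are hypothetical. It avoids the "your friend moved" false alarm, but it cannot tell a field that Facebook dropped from one the user really removed.

```python
def changed_fields(previous, current, watched):
    """Return {field: (old, new)} for watched fields that differ.

    A field counts as changed only when it is present in BOTH snapshots
    with different values. A field missing from either side is skipped.
    """
    changes = {}
    for field in watched:
        if field not in previous or field not in current:
            continue  # dropped by the API or genuinely absent: we can't tell
        if previous[field] != current[field]:
            changes[field] = (previous[field], current[field])
    return changes


def merge_snapshot(last_known, current):
    """Keep the last known value of any field the new response omits."""
    merged = dict(last_known)
    merged.update(current)
    return merged


before = {"location": "Austin, Texas", "relationship_status": "Single", "gender": "male"}
# Second poll: "location" is missing from the response entirely.
after = {"relationship_status": "In a Relationship", "gender": "male"}

watched = ["location", "relationship_status", "gender"]
print(changed_fields(before, after, watched))
# {'relationship_status': ('Single', 'In a Relationship')}

# Carry the old location forward so the next comparison still has a value.
print(merge_snapshot(before, after)["location"])  # Austin, Texas
```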

Attributes within fields disappear and reappear
It was annoying to see missing fields but I thought I could work around that. Even worse, though, is that the API will sometimes drop specific attribute data within a field. For example, one feature of the Life Events Alert was that it would notify you if your friends updated their job titles or changed majors in college. Unfortunately, even if the API returns the work field and education field, it might temporarily omit the position and concentration attributes of those fields. Did John just become a Lead Developer at Google or was his position at Google just missing the last nine times we checked?
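
One possible mitigation is to report a change only after the same new value has shown up in several polls. The sketch below shows that idea in Python. The class name and the threshold of three sightings are arbitrary assumptions, and no fixed threshold is guaranteed to outlast the API's gaps. A missing attribute is passed in as None.

```python
class ChangeConfirmer:
    """Confirm a change only after repeated sightings of the same new value."""

    def __init__(self, baseline, required=3):
        self.confirmed = baseline  # last value we trust
        self.required = required   # sightings needed before reporting
        self._candidate = None
        self._streak = 0

    def observe(self, value):
        """Feed one poll result; return (old, new) once a change is confirmed."""
        if value is None:
            return None  # attribute missing this time: no evidence either way
        if value == self.confirmed:
            # Back to the trusted value, so discard any pending candidate.
            self._candidate, self._streak = None, 0
            return None
        if value == self._candidate:
            self._streak += 1
        else:
            self._candidate, self._streak = value, 1
        if self._streak >= self.required:
            old, self.confirmed = self.confirmed, value
            self._candidate, self._streak = None, 0
            return (old, value)
        return None


position = ChangeConfirmer("Developer", required=3)
polls = [None, "Lead Developer", None, "Lead Developer", "Lead Developer"]
print([position.observe(p) for p in polls])
# [None, None, None, None, ('Developer', 'Lead Developer')]
```

Note that this only helps with value updates. If the baseline itself was never seen, the first confirmed value still looks like a brand-new job title.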

Mutable IDs
Finally, this issue was more of an annoyance than a real problem but it’s worth mentioning. It turns out that practically everything in the Graph API has an ID. For example, even the year attribute of the college you attended has its own ID. Why does 2004 need an ID? I don’t know. In any case, if you saw an object with an ID you would probably expect the ID to stay the same. Nope! During the course of just a few days, I saw the IDs of several objects change including the IDs for job titles, degree concentrations, and yes – even years. I realized quickly that I had to simply ignore IDs when comparing data.
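
Ignoring IDs can be as simple as stripping every "id" key before comparing two snapshots. Here is a minimal Python sketch of that. The function name is hypothetical, and the shape of the education entry is only illustrative.

```python
def strip_ids(value):
    """Return a copy of a decoded JSON value with every "id" key removed."""
    if isinstance(value, dict):
        return {k: strip_ids(v) for k, v in value.items() if k != "id"}
    if isinstance(value, list):
        return [strip_ids(v) for v in value]
    return value


# Two polls of the same education entry; only the IDs differ.
monday = {
    "school": {"id": "111", "name": "Example University"},
    "year": {"id": "222", "name": "2004"},
    "concentration": [{"id": "333", "name": "Computer Science"}],
}
friday = {
    "school": {"id": "111", "name": "Example University"},
    "year": {"id": "987", "name": "2004"},
    "concentration": [{"id": "654", "name": "Computer Science"}],
}

print(monday == friday)                        # False
print(strip_ids(monday) == strip_ids(friday))  # True
```

This is meant for nested attributes like the ones above. You would still want to keep the ID of the person whose profile you are tracking.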

To be fair, Facebook probably has performance and scale reasons for these “data consistency” issues. In fact, it’s probably by design and just not something they explain to developers. In retrospect, I should have probably run some better experiments before going ahead and implementing my entire friggin feature. Lesson learned.

Second-class citizenship

Despite the issues I describe above, it’s still great that Facebook provides an API for developers like myself. There is a fair amount of functionality they expose and I was able to implement almost all of the features I planned. That said, I have to accept that my app is a “second-class citizen” in the ecosystem. Here are some reasons why:

  • 3rd-party apps are limited to 10 posts per day per user through the API. I understand that this limit was added to prevent spam, but it’s really hard to be a legitimate replacement client for Facebook with such a low limit.
  • Posts made by 3rd-party apps do not get a “share” link next to them in the news feed. Shouldn’t all posts be treated equally?
  • 3rd-party apps receive inconsistent data. Yes – this is a reference to the issues I describe above. They’re worth mentioning again.
  • There is no API for “poke”. Yes – the feature that started it all and is at the core of Facebook’s multi-billion-dollar valuation – completely inaccessible by 3rd-party apps. It’s a shame. Not to mention that other APIs – such as “write” access for Facebook messaging – are missing too.
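
Given the posting limit in the first bullet, a client probably wants to track the quota itself instead of finding out from a failed API call. The sketch below is one way to do that in Python. The class name is hypothetical, and it assumes the limit resets per calendar day, which is a guess since the reset window is not something I can confirm.

```python
from datetime import date

DAILY_POST_LIMIT = 10  # posts per day per user, as described above


class PostQuota:
    """In-memory counter of posts made per user per calendar day."""

    def __init__(self, limit=DAILY_POST_LIMIT):
        self.limit = limit
        self._counts = {}  # (user_id, day) -> number of posts made

    def remaining(self, user_id, day=None):
        day = day or date.today()
        return max(0, self.limit - self._counts.get((user_id, day), 0))

    def try_post(self, user_id, day=None):
        """Record a post and return True, or return False if the quota is spent."""
        day = day or date.today()
        key = (user_id, day)
        used = self._counts.get(key, 0)
        if used >= self.limit:
            return False
        self._counts[key] = used + 1
        return True


quota = PostQuota()
if quota.try_post("user-1"):
    print("ok to call the API;", quota.remaining("user-1"), "posts left today")
else:
    print("limit reached; queue the post or tell the user")
```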

Developer Support

Lastly, this is where I think Facebook really falls short. On the surface, Facebook seems to have a good number of resources available to developers: documentation, forums, and Bugzilla to name a few. However, read the documentation and you will inevitably find inaccurate information. Search the forums and you will see unanswered post after unanswered post. Browse Bugzilla and you will see countless “unconfirmed” bugs that have been around for months. I can’t tell you how many bug reports I read in which developers had to repeatedly beg for an answer from Facebook staff. In fact, there’s a reason I’m writing a blog post here and not posting bugs and asking questions on the Facebook forum. The two questions I previously posted on the forum went completely unanswered and I quickly realized that it was a black hole. This is in stark contrast to the Twitter community in which there is an active mailing list and frequent responses from Twitter staff. Ultimately, people will continue to develop for Facebook because it is such a large and valuable platform. However, there is a lot more that Facebook can do to support its 3rd-party developers and create an ecosystem that truly thrives.

Conclusion

All in all, I’m glad I had a chance to work with the Facebook API and was able to create my Facebook over email app. Facebook will clearly be an important platform for a long time to come. I hope the information in this post helps those who are thinking about developing on the platform. I would also like to know if there is anything I misunderstood or misstated about the way the Facebook APIs work. I’ve only spent a month – albeit a long one – developing on the platform and I’m sure there is a lot more for me to learn. Leave your comments!

71 thoughts on “One long, intimate month with the Facebook API”

  1. Anil, why didn’t you just ignore fields and attributes that were missing? So if education:concentration returns "CompSci" on the first call, but then is missing on the second call, don’t treat that as a change. Don’t consider the attribute changed until it is present and different.

    This comment was originally posted on Hacker News
  2. +1 regarding terrible developer support.

    It’s as if Facebook can’t keep up with their growth and just aren’t able to deal with developers as a result.

    I tried to develop an app that kept getting banned for exceeding the (undisclosed) API limits. I tried to talk to Facebook developer support about it, but at one point they just refused to look further into the matter, resulting in a permanently banned personal account, too. “This decision is final and cannot be appealed”.

    If you aim to be huge, expect huge amounts of support to be part of the work. With Facebook, it’s you against them and everyone else waiting for answers.

  3. Good read. And a good reminder that you shouldn’t (or can’t) rely on documentation or a single request here and there when you decide to work with an API. And that the amount of support and ‘release cycles’ (frequency of updates and fixes) is something that you should always consider when integrating another piece of software into your own system.

    This comment was originally posted on Hacker News
  4. Right, I also thought I could work around the issues by ignoring fluctuations in the data. The problem is knowing what is the correct baseline data. In your example, the education:concentration might be missing when you first start checking and only start appearing a long time later. Do you ignore the data when it first appears or report that the person just added another major at college (which may be true!)? This is a bigger problem with fields that people might completely add/remove instead of simply updating (e.g. relationship status). I tried baking in a time period (e.g. 24 hours) in which I ignored all changes and waited for the baseline data to become obvious, but even that didn’t work because some data took even longer to re-appear. At some point I had to admit to myself that – with all these work-arounds – my simple feature had turned into a hack.

    This comment was originally posted on Hacker News
  5. We run a program that monitors a significant number of Facebook Pages. It drives me crazy because every once in a while a certain attribute, like the author, is just omitted. It’ll usually show up again later on.

    This comment was originally posted on Hacker News
  6. Having built a Facebook application that uses FB video I can understand where he’s coming from. The Graph API is a great concept but like so many things Facebook, only halfway executed. It always amazed me that there are so many differences between a video in FQL and a video in the Graph API. In the end I figured out that "begin … rescue … end" was my friend when dealing with the Facebook APIs. And just about any call I make to the OpenGraph API is now wrapped in them.

    This comment was originally posted on Hacker News
  7. Ha – ChromeSpeed! Glad to see you have the project page hosted and the video on YouTube. Maybe we should port to iPhone? Definitely an Angry Birds killer… OK, sorry to go off topic, we should catch up offline (or online elsewhere)!

    This comment was originally posted on Hacker News
  8. Thanks. I’m definitely keeping the code around since it’s fully implemented. I might either reduce the scope of the alerts (for example, ignore add/remove and focus on changes only) or just bring it back when I have more confidence in the API.

    This comment was originally posted on Hacker News
  9. I won’t go into the specifics of the article, because there are people far more qualified than I – I did enjoy it. There’s a lesson here in marketing: can you predict which name will get more buzz, ‘Breakup Notifier’ or ‘Friend Mail’?

    There’s another lesson in marketing – you sell the sizzle, not the steak, but if the steak is awful you won’t sell very many for very long regardless of how sizz-tastic your marketing is.

    I can’t compare both these products – and I hope they’re both solid.

    This comment was originally posted on Hacker News
  10. We just work with the API within the usual scope (fb connect, etc) but it breaks regularly and things sometimes change without notice. It’s really a joke for a company of that size. It has become kind of a running gag in the office (like: "Skip the tests, just release it. Today we do it facebook-style")

    This comment was originally posted on Hacker News
  11. Gotcha. I thought you might report updates only. The problem is that people can also remove their status instead of setting it back to "Single". Focusing on value updates only is probably fine and I’m sure nobody is going to freak out if the app misses a breakup 😛 In my case, I was trying to create a more comprehensive notification that covered events such as new relationships (i.e. the user does not have a relationship status and one day finally sets it) and I couldn’t afford the false positives.

    This comment was originally posted on Hacker News
  12. “One motivation for writing this post was that I was surprised to see how often the Facebook API will drop/omit data that exists and that you should be able to access.”

    I loved the article, but I’m a bit surprised that you’re surprised. What else would you expect from a massive NoSQL installation? It seems to me a given that they are going to trade some data integrity (in the form of missing attributes) for speed of access. I’d assume that was a fundamental design goal of the architecture.

    This comment was originally posted on Hacker News
  13. Well, the graph API reports things as JSON. Your app has to page through the resulting JSON and see what information is relevant and meaningful. I’m barely starting the Facebook API but the solution I’ll use is to put the code for interpreting a given query into a series of configuration files using a DSL to interpret the JSON. It’s mostly a series of entries like: foo = [bar or baz or "default"] This way any change in graph structure can be dealt with quickly (this would be a problem for XML/XSLT except they are too slow AND they don’t do incremental interpretation).

    This comment was originally posted on Hacker News
  14. And again, 2 years later and nothing has changed. Of over 20 different publisher APIs that I import data from I have to isolate Facebook from the rest of the process so that when it breaks, which is about once per month, it doesn’t affect anything else.
