Heroku Connect is a synchronization service, conceptually similar to Dropbox or iCloud, that keeps data in sync between a Salesforce deployment and a Heroku Postgres database. This lets us, as developers, use open-source technologies to work directly with Salesforce data without having to build complicated and expensive integrations (API-based or otherwise) to sync data bi-directionally.
When you are building a customer-facing application whose data needs to be shared with the Salesforce platform, Heroku Connect is a great fit. Sounds like magic, right? Let’s look at how it all works.
Here you can see the data flow between each of the services that consume Salesforce data in the overall system. Essentially, Salesforce acts as our main data store. To expose particular data objects to our web application, we use Heroku Connect (built on Postgres), which acts as our intermediary cache store. “Cache” might be a bit of a misnomer: I use the word to mean that our Heroku Connect Postgres instance is a persistent store that shuttles information to and from Salesforce, which remains the main source of truth for our data.
With our Salesforce data accessible in a Postgres instance, our client application can be powered by any of a number of popular open-source web stacks (Rails, Node, Python, etc.) that have nice ORM integrations with Postgres.
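As a rough illustration of what this looks like from the application side, here is a minimal Python sketch that reads synced contacts through a plain DB-API connection. It assumes the Contact mapping lands in a table named salesforce.contact with sfid, name, and email columns; those names are assumptions and depend on how the mapping is configured. The make_demo_connection helper is hypothetical and uses an in-memory SQLite database purely as a local stand-in, so the sketch runs without a Heroku account.

```python
import sqlite3


def fetch_contacts(conn, limit=10):
    """Return (sfid, name, email) rows from the synced Contact table.

    `conn` can be any DB-API 2.0 connection. Table and column names
    are assumptions about how the Contact object was mapped.
    """
    sql = (
        "SELECT sfid, name, email "
        "FROM salesforce.contact "
        "ORDER BY name "
        f"LIMIT {int(limit)}"  # int() keeps the limit a plain number
    )
    cur = conn.cursor()
    try:
        cur.execute(sql)
        return cur.fetchall()
    finally:
        cur.close()


def make_demo_connection():
    """Hypothetical local stand-in for the Heroku Postgres database."""
    conn = sqlite3.connect(":memory:")
    # Attach a second in-memory database so "salesforce.contact" resolves.
    conn.execute("ATTACH DATABASE ':memory:' AS salesforce")
    conn.execute(
        "CREATE TABLE salesforce.contact (sfid TEXT, name TEXT, email TEXT)"
    )
    conn.executemany(
        "INSERT INTO salesforce.contact VALUES (?, ?, ?)",
        [
            ("003B", "Grace Hopper", "grace@example.com"),
            ("003A", "Ada Lovelace", "ada@example.com"),
        ],
    )
    return conn


if __name__ == "__main__":
    for sfid, name, email in fetch_contacts(make_demo_connection()):
        print(sfid, name, email)
```

Against the real database you would pass in a connection opened with a Postgres driver (psycopg2, for example) instead of the demo helper; the query itself stays ordinary SQL.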
How it Works
There is a lot of magic going on behind the scenes, but essentially the sync between the Salesforce data and the Heroku Postgres data is configurable in two ways: what to sync and how to sync.
What to Sync
Instead of mapping every data object in Salesforce, you can select certain objects and attributes to map to a corresponding Heroku Postgres table. Here we are mapping the Contact object to a table in our database:
Once the mapping is done, Heroku’s interface does a great job of displaying an overview of the rows that have been synced and the latest status of the sync.
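To make the idea of a mapping a little more concrete, here is a small Python sketch of the information a mapping carries: one Salesforce object, the fields chosen for syncing, and the table they end up in. The ObjectMapping class is hypothetical and is not Heroku Connect's own configuration format. The naming rules it encodes (a salesforce schema, lowercased table and column names, the record Id stored in sfid) are assumptions you should check against your own database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ObjectMapping:
    """Hypothetical description of one mapping; not Heroku Connect's own format."""

    salesforce_object: str        # e.g. "Contact"
    fields: tuple                 # Salesforce fields chosen for syncing
    schema: str = "salesforce"    # assumed Postgres schema name

    @property
    def table(self):
        # Assumption: the table is named after the object, lowercased.
        return f"{self.schema}.{self.salesforce_object.lower()}"

    @property
    def columns(self):
        # Assumption: the record Id lives in "sfid" and every other
        # selected field becomes a lowercased column.
        return ["sfid"] + [name.lower() for name in self.fields]


contact_mapping = ObjectMapping("Contact", ("Name", "Email"))

if __name__ == "__main__":
    print(contact_mapping.table)
    print(contact_mapping.columns)
```

The point of the sketch is only that a mapping is selective: fields you leave out of it never reach Postgres.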
How to Sync
The more technical part of integrating with Heroku Connect is how you sync data. The first option is periodic polling, in which Heroku Connect checks for new data at a fixed interval, typically every 10 minutes. Depending on your product needs, this sync frequency can be acceptable. However, for more “real-time” applications, you might need a lower-latency option.
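To see why polling adds latency, here is a conceptual Python sketch of a single polling cycle. It is not how Heroku Connect is implemented; poll_once, the record dictionaries, and the modified timestamps are all made up for illustration. Each cycle picks up whatever changed since the previous cycle, so a change made just after a poll waits almost a full interval before it is synced.

```python
from datetime import datetime, timedelta, timezone

POLL_INTERVAL = timedelta(minutes=10)  # typical interval mentioned above


def poll_once(records, last_poll, now):
    """Return (changed_records, new_last_poll) for one polling cycle.

    A record counts as changed when its "modified" timestamp falls
    after the previous poll and no later than the current one.
    """
    changed = [r for r in records if last_poll < r["modified"] <= now]
    return changed, now


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
    records = [
        {"id": "A", "modified": start - timedelta(minutes=5)},
        {"id": "B", "modified": start + timedelta(minutes=1)},
    ]
    changed, last_poll = poll_once(records, start, start + POLL_INTERVAL)
    for r in changed:
        # How long the change waited before the poll picked it up.
        print(r["id"], last_poll - r["modified"])
```

In this made-up run, record A was already handled by an earlier cycle, while record B changed one minute after the previous poll and has to wait for the next one.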
This is where event-driven syncing, built on the Salesforce Streaming API, comes in. While this approach reduces latency and uses computing resources more efficiently, the setup is slightly more complicated. Without going into full detail, Heroku Connect creates PushTopic handlers on your Salesforce system, named with the prefix hc_, that your Postgres instance subscribes to. You need to make sure that your Salesforce account has the Streaming API enabled and that Heroku Connect is enabled for streaming updates. Despite all the “magic” behind the scenes, Heroku gives you a nice interface for it, and most of the setup is pure configuration and settings.
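For contrast with polling, here is a conceptual Python sketch of the event-driven side: each notification is applied to a local copy of the data as soon as it arrives. Heroku Connect does all of this for you, so you would not write this code yourself. The message shape is modeled on Streaming API PushTopic notifications (an event type plus the changed record), while apply_event, make_message, and the /topic/hc_example channel name are hypothetical.

```python
def apply_event(table, message):
    """Apply one PushTopic-style notification to a local table.

    `table` is a dict keyed by Salesforce record Id, standing in for
    the synced Postgres table.
    """
    event_type = message["data"]["event"]["type"]
    record = message["data"]["sobject"]
    record_id = record["Id"]
    if event_type == "deleted":
        table.pop(record_id, None)
    else:
        # "created", "updated", and "undeleted" all upsert the record,
        # merging new field values over any existing ones.
        table[record_id] = {**table.get(record_id, {}), **record}
    return table


def make_message(event_type, sobject, channel="/topic/hc_example"):
    """Build a sample notification; the channel name is hypothetical."""
    return {
        "channel": channel,
        "data": {"event": {"type": event_type}, "sobject": sobject},
    }


if __name__ == "__main__":
    contacts = {}
    apply_event(contacts, make_message("created", {"Id": "003A", "Name": "Ada"}))
    apply_event(
        contacts,
        make_message("updated", {"Id": "003A", "Email": "ada@example.com"}),
    )
    print(contacts)
    apply_event(contacts, make_message("deleted", {"Id": "003A"}))
    print(contacts)
```

There is no interval to wait out here: the local copy changes when the event arrives, which is where the latency win over polling comes from.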
How to Set it Up
The best guides I’ve seen out there are here and here. They’ll walk you through the entire process in the context of a real app on Heroku, which is a plus: you learn the setup in a sandbox that very closely matches a production setup.
Heroku/Salesforce has put together a very nice product to make the integration of Salesforce data into a more open-source platform like Heroku Postgres a breeze. Some things to keep in mind as you try your own integration:
- When you need to (1) sync large amounts of data and/or (2) sync data frequently between Salesforce & Heroku, consider using Heroku Connect
- You can map specific data from Salesforce to Heroku that you want to sync
- You can sync using (1) periodic polling or (2) Streaming API events
- Oh yeah…and make sure you have a Salesforce account. When you provision the Heroku Connect add-on it’ll ask for your Salesforce credentials