Over the last year, our team has interviewed more than 200 companies about their data integration use cases. What we discovered is that data integration in 2021 is still a mess.

The Unscalable Current Situation

At least 80 of the 200 interviews were with users of existing ETL technology, such as Fivetran, StitchData and Matillion. We found that every one of them was also building and maintaining their own connectors even though they were using an ETL solution (or an ELT one; for simplicity, I will just use the term ETL). Why?

We found two reasons:

  1. Incomplete coverage for connectors

Inability to cover all connector needs

Many users’ ETL solutions didn’t support the connector they wanted, or supported it but not in the way they needed.

An example for context: Fivetran has been in existence…

In early March 2021, we announced our $5.2M seed round with Accel for Airbyte, our open-source data integration platform. Two months later, we announced our $26M Series A round led by Benchmark.

In this article, we want to tell you about what happened behind the scenes, including the deck we presented to the Benchmark team. We hope this will give you some insights about the fundraising process, even though our case might be atypical.

If you like what we do, don’t hesitate to subscribe to our newsletter or star our GitHub project!

If you’re mostly interested in the deck, here’s…

For context, Airbyte is an open-source data integration platform. Our goal is to commoditize data integration. In January, we shared how we were thinking about OKRs, along with our OKRs for Q1 2021. We now want to give an update on them and on how they have evolved for the second quarter.

Our focus for 2021 is to become the open-source standard for replicating data. This entails three overarching goals:

  1. Making Airbyte just work, regardless of your data infrastructure, volume and connector needs.

In this article, we will show you how you can understand how much your team leverages Zoom, or spends time in meetings, in a couple of minutes. We will be using Airbyte (an open-source data integration platform) and Tableau (a business intelligence and analytics software) for this tutorial.

Here is what we will cover:

  • Step 1: Setting up data replication from Zoom to a PostgreSQL database using the Airbyte Zoom connector

We will produce the following charts in Tableau:

  • Evolution of the number…

There are quite a few decks already available online, and we found them all useful, each in different ways. That’s why we thought the deck we used for our seed round could be helpful to some companies, especially those that are open source or developer tools. For context on that seed deck: we started working on our open-source data integration platform at the end of July and raised our seed round with Accel in December, only 5 months in. We’re assuming the deck was acceptable, as we raised the seed after only 13 days in the fundraising process.

We will…

We try to limit our discussions with VCs, as they can easily become a distraction. As a startup, focus is what will differentiate between success and failure. But sometimes, we can’t refuse an introduction and a discussion, as some investors have a lot of insights on your industry.

Recently, we had one discussion with a top-tier VC general partner. In addition to a lot of feedback and insights, one question in particular he asked really struck me: “What is your truth for 2021?”

In this article, we will explain what he means by truth, and what our immediate answer was…

This article will show how to use Airbyte — an open-source data integration platform — and Apache Superset — an open-source data exploration platform — in order to build a Slack activity dashboard showing:

  • Total number of members of a Slack workspace

Before we get started, let’s take a high-level look at how we are going to achieve creating a Slack dashboard using Airbyte and Apache Superset.

We will use Airbyte’s Slack connector to get the data…

The Slack free tier saves only the last 10K messages. For social Slack instances, it may be impractical to upgrade to a paid plan to retain these messages. Similarly, for an open-source project like Airbyte where we interact with our community through a public Slack instance, the cost of paying for a seat for every Slack member is prohibitive.

However, searching through old messages can be really helpful. Losing that history feels like some advanced form of memory loss. What was that joke about Java 8 Streams? This contributor question sounds familiar — haven’t we seen it before? …

When you’re selling or considering purchasing a B2B tool, you need to understand the build vs. buy argument. What are the pros and cons of building the tool internally vs. buying the tool from a third-party vendor? This is especially true in big companies where you have the resources to build such tools. Early-stage startups will generally opt for the faster route, going with self-serve B2B tools, unless the pricing is prohibitive.

But something we don’t often think about is how open-source just messes the whole thing up. The build is completely redefined. You now need to compare…

At Airbyte, we’re very transparent about our journey. We’re building an open-source EL(T) platform, and I guess it’s easier to be fully open and transparent when building open-source technology, as it must be part of your DNA. In any case, we hope the lessons from our journey can help other entrepreneurs (open-source or not) in their own journeys.

Our whole team has had diverse experiences with OKRs (Objectives and Key Results). We’ve seen them implemented in very useful ways, and in some cases in unproductive, pressure-inducing ways. …

John Lafleur

Co-Founder of Airbyte, the new open-source standard for data integrations. Author at SDTimes, Linux.com, TheNewStack, Dzone… Happy husband and dad :)
