Connector Development
There are two types of connectors in Airbyte: Sources and Destinations. Connectors can be built in any programming language, as long as they're built into Docker images that implement the Airbyte specification.
Most database sources and destinations are written in Java, and most API sources and destinations are written in Python using our Python CDK or low-code CDK.
If you're looking to build a connector for an API Source, we strongly suggest starting in the Connector Builder.
If you need help with connector development, we offer premium support to our open-source users. Talk to our team to get access to it.
Connector Builder
The Connector Builder UI is based on the low-code development framework below and lets you develop and use connectors without leaving the Airbyte web UI. No local developer environment is required.
Low-code Connector-Development Framework
You can use the low-code framework to build source connectors for HTTP API sources. The low-code CDK is a declarative framework that provides a YAML schema to describe your connector without writing any Python code, while still allowing you to use custom Python code if required.
Python Connector-Development Kit (CDK)
You can build a connector in Python with the Airbyte CDK. Compared to the low-code CDK, the Python CDK is more flexible, but building the connector will be more involved. It provides classes that work out of the box for most scenarios, and Airbyte provides generators that make the connector scaffolds for you.
Community-maintained CDKs
The Airbyte community also maintains some CDKs:
- The TypeScript CDK is actively maintained by Faros.ai for use in their product.
- The Airbyte Dotnet CDK in C#.
The Airbyte specification
Before building a new connector, review Airbyte's data protocol specification.
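To make the protocol a little more concrete, here is a minimal sketch of a connector emitting one RECORD message as a single line of JSON. The field names follow the Airbyte protocol's record message as we understand it (type, record.stream, record.data, record.emitted_at in epoch milliseconds); the stream name and data are made up for illustration, and the specification remains the authoritative reference.

```python
import json
import time


def record_message(stream: str, data: dict) -> str:
    """Serialize one RECORD message as a single JSON line.

    Assumption: the message shape mirrors the Airbyte protocol's record
    message; check the specification before relying on it.
    """
    message = {
        "type": "RECORD",
        "record": {
            "stream": stream,
            "data": data,
            # Emission time as milliseconds since the Unix epoch.
            "emitted_at": int(time.time() * 1000),
        },
    }
    return json.dumps(message)


if __name__ == "__main__":
    # Hypothetical stream name and record, for illustration only.
    print(record_message("users", {"id": 1, "name": "Ada"}))
```

A source writes messages like this to standard output, one per line; a real connector also handles the other message types described in the specification.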
Adding a new connector
The easiest way to make and start using a connector in your workspace is by using the Connector Builder.
If you're writing your connector in Python or low-code CDK, use the generator to get the project started:
cd airbyte-integrations/connector-templates/generator
./generate.sh
and choose the relevant template by using the arrow keys. This will generate a new connector in the airbyte-integrations/connectors/<your-connector> directory.
Search the generated directory for "TODO"s and follow them to implement your connector. For more detailed walkthroughs and instructions, follow the relevant tutorial:
- Speedrun: Building a HTTP source with the CDK
- Building a HTTP source with the CDK
- Building a Java destination
As you implement your connector, make sure to review the Best Practices for Connector Development guide.
Updating an existing connector
The steps for updating an existing connector are the same as for building a new connector, except that you don't need to run the generator. The steps are:
- Iterate on the connector to make the needed changes
- Run tests
- Add any needed docs updates
- Create a PR to get the connector published
Adding Typing and Deduplication to a connector
Coming soon.
Typing and Deduplication is how Airbyte transforms the raw data transmitted during a sync into easy-to-use final tables for database and data warehouse destinations. For more information on how typing and deduplication work, see this doc.
Publishing a connector
Once you've finished iterating on the changes to a connector as specified in its README.md, follow these instructions to ship the new version of the connector with Airbyte out of the box.
- Bump the docker image version in the metadata.yaml of the connector.
- Submit a PR containing the changes you made.
- An Airbyte maintainer will review the change in the new version and make sure the tests are passing.
- You or an Airbyte maintainer can merge the PR once it is approved and all the required CI checks are passing.
- Once the PR is merged, the new connector version will be published to DockerHub and will be available to everyone who uses the connector. Thank you!
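The version bump in the first step can be sketched as a small text edit. This is an illustration only: it assumes the version lives on a line of the form dockerImageTag: X.Y.Z in metadata.yaml and that a patch-level bump is what you want. Both are assumptions, so check your connector's metadata.yaml and choose the version component that matches your change.

```python
import re


def bump_patch(version: str) -> str:
    """Return the version with its patch component incremented."""
    major, minor, patch = (int(part) for part in version.split("."))
    return f"{major}.{minor}.{patch + 1}"


def bump_metadata_text(text: str, key: str = "dockerImageTag") -> str:
    """Bump the first `key: X.Y.Z` line found in the given YAML text.

    Assumption: the key name and the plain X.Y.Z value format are
    illustrative; adjust them to your connector's metadata.yaml.
    """
    pattern = re.compile(
        rf"^([ \t]*{re.escape(key)}:[ \t]*)(\d+\.\d+\.\d+)[ \t]*$",
        re.MULTILINE,
    )

    def _replace(match: re.Match) -> str:
        # Keep the indentation and key, replace only the version.
        return match.group(1) + bump_patch(match.group(2))

    new_text, count = pattern.subn(_replace, text, count=1)
    if count == 0:
        raise ValueError(f"no '{key}: X.Y.Z' line found")
    return new_text


if __name__ == "__main__":
    sample = "data:\n  dockerImageTag: 0.1.9\n  name: Example\n"
    print(bump_metadata_text(sample))
```

Working on the raw text rather than parsing and re-serializing the YAML keeps comments and key order in the file untouched.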
Updating Connector Metadata
When a new (or updated) version of a connector is ready, our automations will check your branch for a few things:
- Does the connector have an icon?
- Does the connector have documentation and is it in the proper format?
- Does the connector have a changelog entry for this version?
- Is the metadata.yaml file valid?
If any of these checks fail, you won't be able to merge your PR or publish your connector.
Connector icons should be square SVGs and be located in this directory.
Connector documentation and changelogs are markdown files living either here for sources, or here for destinations.
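If you want to check the changelog requirement locally before pushing, something like the sketch below may help. It assumes the changelog is a Markdown table whose first column holds the version; that layout, the function name, and the sample rows are assumptions for illustration, and this is not the check that Airbyte's automation actually runs.

```python
def has_changelog_entry(doc_markdown: str, version: str) -> bool:
    """Return True if a Markdown table row starts with the given version.

    Assumption: changelog rows look like `| 0.1.1 | date | PR | subject |`.
    """
    for line in doc_markdown.splitlines():
        stripped = line.strip()
        if not stripped.startswith("|"):
            continue  # Not a table row.
        cells = [cell.strip() for cell in stripped.strip("|").split("|")]
        if cells and cells[0] == version:
            return True
    return False


if __name__ == "__main__":
    # Made-up changelog content, for illustration only.
    doc = (
        "## Changelog\n"
        "\n"
        "| Version | Date | Pull Request | Subject |\n"
        "|:--------|:-----|:-------------|:--------|\n"
        "| 0.1.1 | (date) | (PR link) | Example fix |\n"
    )
    print(has_changelog_entry(doc, "0.1.1"))
```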
Using credentials in CI
To run integration tests in CI, you'll often need to inject credentials. There are a few steps for doing this:
- Place the credentials into Google Secret Manager (GSM): Airbyte uses a project's Google Secret Manager service as the source of truth for all CI secrets. Place the credentials exactly as they should be used by the connector into a GSM secret here, i.e. the secret should be a copy-paste of the config.json passed into a connector via the --config flag. We use the following naming pattern: SECRET_<capital source OR destination name>_CREDS, e.g. SECRET_SOURCE-S3_CREDS or SECRET_DESTINATION-SNOWFLAKE_CREDS.
- Add the GSM secret's labels:
  - connector (required) -- the unique connector name, or a set of connector names with '_' as the delimiter, e.g. connector=source-s3, connector=destination-snowflake
  - filename (optional) -- a custom target secret file. Google does not allow '.' in label values, so Airbyte CI scripts add '.json' to the end automatically. By default, secrets are saved to ./secrets/config.json, e.g. filename=config_auth => secrets/config_auth.json
- Save the necessary JSON value (see Example).
- That should be it.
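The naming and label conventions above can be summarized in a short sketch. The function names are hypothetical helpers written for this page, not part of Airbyte's CI scripts; they only encode the rules stated in the steps.

```python
from typing import List, Optional


def gsm_secret_name(connector: str) -> str:
    """Build the secret name: SECRET_<capitalized connector name>_CREDS."""
    return f"SECRET_{connector.upper()}_CREDS"


def connectors_from_label(label_value: str) -> List[str]:
    """Split a `connector` label value that uses '_' as the delimiter."""
    return [name for name in label_value.split("_") if name]


def secret_target_path(filename_label: Optional[str] = None) -> str:
    """Resolve the file a secret is saved to.

    The `filename` label cannot contain '.', so '.json' is appended here.
    Without a label, the default file name is config.json.
    """
    stem = filename_label or "config"
    return f"secrets/{stem}.json"


if __name__ == "__main__":
    print(gsm_secret_name("source-s3"))
    print(connectors_from_label("source-s3_destination-snowflake"))
    print(secret_target_path("config_auth"))
```

Note that connector names use '-' internally, which is why '_' can safely serve as the delimiter between several names in one label.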
Access CI secrets on GSM
Access to GSM storage is limited to Airbyte employees. To give an employee permissions to the project:
- Go to the permissions page
- Add a new principal to dataline-integration-testing:
  - input their login email
  - select the role Development_CI_Secrets
- Save