Capstone
Congrats on making it this far! In this section, you’ll apply what you’ve learned and build out your own Dagster project.
Project data
The best way to apply what you’ve learned is to use your own organization’s data, but if this isn’t possible, you can use additional data from NYC OpenData or GitHub’s list of public APIs.
If using NYC OpenData, we recommend using the 311 Service Requests from 2010 to Present dataset. The data is downloadable as a CSV and you’re welcome to view details about it here.
Whatever data you decide to use, choose something that sparks your interest!
You can also continue to use DuckDB if needed.
Project requirements
When building your own project, it should:
- Create a data pipeline using assets
- Try out one of Dagster’s supported integrations like dbt, Fivetran, or Airbyte
- Include a resource that connects to a data source or storage
  - We provide out-of-the-box connections to most popular cloud data warehouses. However, you can turn any Python connection or object into a resource.
  - If you aren’t going to use company data, you can continue using DuckDB.
- Perform some transformations on the data, using either cloud compute or in-memory computations
  - If you’re interested in using dbt, check out our dbt integration!
- Partition your assets and try backfilling historical data
- Make a report using the data
Getting help
If you get stuck or have questions, you can:
- Join the Dagster Slack community and ask a question in our #ask-community channel
- Find solutions and patterns in our GitHub discussions
- Check out the Dagster Docs
Share your work!
We’d love to see what you create! When you’re done, you can:
- Add the project to a public GitHub repository
- Tag us on your socials, like Twitter/X or LinkedIn! We’re @dagster on both platforms.
- Join the Dagster Slack community and share your work in #dagster-showcase