Bonobo needs to let the user define transformations in Python and link them in a directed graph. This should be as simple as possible.
Status: Safe to use. Work is still in progress towards 1.0.
Links: Guide, API (todo)
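To make the idea of "transformations linked in a directed graph" concrete, here is a minimal plain-Python sketch. It does not use Bonobo's API: `run_graph`, `extract`, `upper` and `collect` are hypothetical names chosen for illustration, and the graph is simply a dict mapping each node to its downstream nodes.

```python
# Plain-Python illustration only; none of these names come from Bonobo.

def extract():
    """Source node: yields raw rows."""
    yield from ("alpha", "beta", "gamma")


def upper(row):
    """Transformation node: yields zero or more output rows per input row."""
    yield row.upper()


results = []


def collect(row):
    """Sink node: stores the row and produces no output."""
    results.append(row)


def run_graph(source, edges):
    """Push every row produced by `source` through the nodes linked in `edges`."""

    def push(node, row):
        # A sink returns None, so fall back to an empty tuple of outputs.
        for output in node(row) or ():
            for child in edges.get(node, ()):
                push(child, output)

    for row in source():
        for child in edges.get(source, ()):
            push(child, row)


# The directed graph: extract -> upper -> collect.
edges = {extract: [upper], upper: [collect]}
run_graph(extract, edges)
```

After the call, `results` holds the uppercased rows in order. Each node is an ordinary function or generator, which is the property that keeps the definition of a transformation simple.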
Python would not be Python without the cheese shop. A set of ready-to-use extractors, transformers and loaders will be available in the basic Bonobo distribution in the form of "configurable classes".
Status: 25% complete, lacking a lot of standard tools.
Links: API
Concise but complete documentation should be available before 1.0. It should allow anyone with decent Python skills to write their first "real world" transformation in around 10 minutes, detail all APIs, document the full standard library, explain how to contribute, and include a decent set of examples.
Status: 50% complete, already pretty good but lacking sections.
Links: Install, First Steps, Guides, References
Transformations need to rely on external services (a database connection, an API client, an HTTP session, etc.), but to engineer things correctly, these should not be hardcoded in the transformations (for example, you may need to connect to a different database on your production server than the one you use on your laptop).
Status: 50% complete.
Links: Guide, API
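The principle of keeping services out of the transformation code can be sketched as follows. This is an assumption-laden illustration, not the Services API itself: `resolve_and_call`, `lookup_user` and the two service dictionaries are hypothetical, and a plain dict stands in for a database connection.

```python
import inspect


def resolve_and_call(func, services, *args):
    """Call `func`, filling its keyword-only parameters from `services` by name."""
    parameters = inspect.signature(func).parameters
    injected = {
        name: services[name]
        for name, parameter in parameters.items()
        if parameter.kind is inspect.Parameter.KEYWORD_ONLY and name in services
    }
    return func(*args, **injected)


def lookup_user(user_id, *, database):
    """Transformation that needs a `database` service but does not create it."""
    return database[user_id]


# Hypothetical per-environment service registries (dicts stand in for connections).
laptop_services = {"database": {1: "test-user"}}
production_services = {"database": {1: "real-user"}}

on_laptop = resolve_and_call(lookup_user, laptop_services, 1)
in_production = resolve_and_call(lookup_user, production_services, 1)
```

The transformation is identical in both environments; only the registry handed to the runner changes.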
This relies on the Services API, but we need to be able to use SQL and relational database connections while transforming data. The code is available in the old rdc.etl codebase, but some tuning is necessary (for example, we need to use cursors instead of offset/limit, where available, to avoid concurrency problems), and a stable API and documentation are still needed.
Status: 20% complete, depends on Services API to go further.
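The cursor approach mentioned above can be sketched with the standard library's `sqlite3` module. This is not the rdc.etl code: `iter_rows`, the `items` table and the batch size are assumptions made for the example. The point is that a single query is executed once and its result set is read in batches, instead of re-running the query with a new offset for each page, which can skip or repeat rows if the table changes between pages.

```python
import sqlite3


def iter_rows(connection, query, batch_size=2):
    """Yield rows from one executed query, fetching them in small batches."""
    cursor = connection.cursor()
    cursor.execute(query)
    try:
        while True:
            batch = cursor.fetchmany(batch_size)
            if not batch:
                break
            yield from batch
    finally:
        cursor.close()


# In-memory database with a hypothetical table, for demonstration only.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT)")
connection.executemany(
    "INSERT INTO items (name) VALUES (?)", [("a",), ("b",), ("c",)]
)

rows = list(iter_rows(connection, "SELECT id, name FROM items ORDER BY id"))
```

Because `iter_rows` is a generator, a downstream transformation can start consuming rows before the whole result set has been read.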
It should be fairly easy to add type checks based on Python 3 annotations, so that one can choose to enforce strongly typed transformations. Unlike in Java ETLs, this is an opt-in feature and is not required.
Likely to build: very high
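As a rough idea of what opt-in checking from annotations could look like, here is a small sketch. It is not a planned Bonobo interface: the `typechecked` decorator is hypothetical, and it only checks annotations that are plain classes.

```python
import functools
import inspect
import typing


def typechecked(func):
    """Opt-in decorator: check arguments and return value against annotations."""
    hints = typing.get_type_hints(func)
    signature = inspect.signature(func)

    def check(label, value):
        expected = hints.get(label)
        # Only simple class annotations are checked in this sketch.
        if isinstance(expected, type) and not isinstance(value, expected):
            raise TypeError(
                f"{label} must be {expected.__name__}, got {type(value).__name__}"
            )

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = signature.bind(*args, **kwargs)
        for name, value in bound.arguments.items():
            check(name, value)
        result = func(*args, **kwargs)
        check("return", result)
        return result

    return wrapper


@typechecked
def double(value: int) -> int:
    """Strongly typed transformation: rejects non-int input."""
    return value * 2


def untyped_double(value):
    """Undecorated transformation: no checks, behaves as usual."""
    return value * 2
```

Calling `double("a")` raises `TypeError`, while `untyped_double("a")` still runs, which reflects the opt-in nature of the feature.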
A strategy for distributing work across more than one server. This will require thought on how messages are passed between nodes, what actually runs on each node, and how we optimize the topology to avoid bandwidth problems.
Likely to build: high, but a lot of specification work is needed first.
Bonobo is already usable in Jupyter, where it shows a little widget. It could easily display the transformation graph while executing, and there may be more to do (such as a step debugger) that would be really handy.
Likely to build: high
to be continued...