Pipeline Steps
Pipelines are made up of a series of steps - each step is a function that takes in a file and returns a file. The file can be modified in any way, including changing the data, adding metadata, or even changing the file type. Here is a list of pipeline step types that are available:
Delete Columns
Use this step to fully drop columns from a file. For example, if a customer is sending an email column but you do not want to receive that column in your data, you can use this step to drop it.
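As a rough illustration of what dropping a column means for the data, here is a minimal Python sketch that operates on CSV text. The function name `delete_columns` and its parameters are hypothetical and are not part of the product; the actual step is configured in the pipeline rather than written as code.

```python
import csv
import io


def delete_columns(csv_text: str, columns_to_drop: set[str]) -> str:
    """Return CSV text with the named columns removed (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Keep the remaining headers in their original order.
    kept = [name for name in (reader.fieldnames or []) if name not in columns_to_drop]
    out = io.StringIO()
    # extrasaction="ignore" silently skips the dropped columns in each row.
    writer = csv.DictWriter(out, fieldnames=kept, extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()
```

For example, calling `delete_columns(text, {"email"})` on a file with `id` and `email` columns would leave only the `id` column.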
Rename Columns
Use this step to rename columns in a file. For example, if a customer is sending an email column but you want to receive it as user_email, you can use this step to rename it. This is particularly useful if you are receiving data from multiple sources and want to standardize the column names.
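The effect of a rename can be sketched in Python as a change to the header row only. This is an illustrative sketch, not the product's implementation; `rename_columns` and the `renames` mapping are hypothetical names.

```python
import csv
import io


def rename_columns(csv_text: str, renames: dict[str, str]) -> str:
    """Return CSV text with header names replaced per `renames` (hypothetical helper)."""
    reader = csv.reader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    header = next(reader, None)
    if header is not None:
        # Only the header row changes; data rows pass through untouched.
        writer.writerow([renames.get(name, name) for name in header])
        writer.writerows(reader)
    return out.getvalue()
```

Using one mapping per source (for instance `{"email": "user_email"}` for one customer and `{"mail": "user_email"}` for another) is one way to arrive at a single standardized column name.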
Split Columns
Use this step to split a column into multiple columns based on a given delimiter. For example, if a customer is sending a name column but you want to receive it as first_name and last_name, you can use this step to split it based on a “space” character. Another example would be splitting an email column into a user and domain column.
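The two examples above can be sketched in Python on a single row. The helper `split_column` is hypothetical. The sketch also makes three assumptions that the text does not confirm: the source column is replaced by the new columns, extra delimiters stay inside the last new column, and missing parts become empty strings.

```python
def split_column(row: dict[str, str], source: str, targets: list[str], delimiter: str) -> dict[str, str]:
    """Split row[source] on `delimiter` into the `targets` columns (hypothetical helper)."""
    # maxsplit keeps any extra delimiters inside the last target column (assumption).
    parts = row[source].split(delimiter, len(targets) - 1)
    # Pad with empty strings when the value has fewer parts than targets (assumption).
    parts += [""] * (len(targets) - len(parts))
    # Assumption: the source column is replaced by the new columns.
    new_row = {key: value for key, value in row.items() if key != source}
    new_row.update(zip(targets, parts))
    return new_row
```

With these assumptions, splitting a `name` value on a space yields `first_name` and `last_name`, and splitting an `email` value on `@` yields `user` and `domain`.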
Bonus - pair this with the Delete Columns step to drop any parts of the newly split column that you don’t want to keep!
Rename File
Use this step to rename the file. For example, if a customer is sending a file named users.csv but you want to receive it as customers.csv, you can use this step to rename it. The new file name will be delivered as a part of the metadata payload inside of the JSON output, which your service can use to decide how to process the file. This is particularly useful if you need to standardize the file names across multiple inputs. (Note: this step does not change the file name on the server, only inside of the delivered JSON payload.)
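On the receiving side, a service might route a delivery based on the renamed file name. The sketch below assumes the parsed JSON payload exposes a top-level `metadata` object containing `fileName`; that nesting, the function name `choose_handler`, and the route labels are all illustrative assumptions.

```python
def choose_handler(payload: dict) -> str:
    """Pick a processing route from the delivered file name (hypothetical routing)."""
    # Assumed shape: {"metadata": {"fileName": "..."}}; the real payload may differ.
    file_name = payload.get("metadata", {}).get("fileName", "")
    if file_name == "customers.csv":
        return "customers"
    # Anything unrecognized falls through to a default route.
    return "unrouted"
```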
Pipeline Input Settings
Pipelines can be configured to run on all files, or only on files that match a given set of criteria. Input settings can match on the following criteria:
File Type - the file type of the file (e.g. csv, json, xml, etc.)
Path - the path of the file (e.g. /users.csv)
Each criterion is matched using one of the following operators:
Equals - the file type or path must match exactly
Not Equals - the file type or path must not match exactly
One Of - at least one of the provided criteria must match the file type or path
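The three operators can be read as simple comparisons. The following Python sketch shows one possible interpretation; the function `matches` is hypothetical, and it assumes case-sensitive, exact string comparison, which the text does not confirm.

```python
def matches(value: str, operator: str, expected) -> bool:
    """Evaluate one input-setting rule against a file type or path (hypothetical helper)."""
    if operator == "Equals":
        return value == expected
    if operator == "Not Equals":
        return value != expected
    if operator == "One Of":
        # `expected` is a collection of acceptable values here.
        return value in expected
    raise ValueError(f"unknown operator: {operator}")
```

Under this reading, a pipeline limited to `File Type` `One Of` `["csv", "json"]` would run on `/users.csv` but skip an XML file.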
Pipeline Output Settings
When a pipeline has completed running, it can be configured to output the file. Currently we support outputting the file via webhook, which has the following settings:
Webhook URL - the URL to send the file to
Output Type - the type of output to send, either JSON or CSV. If this is delivered via JSON, the text file will be converted to JSON based on the headers in the CSV file.
The JSON output includes a metadata field, which contains the following information:
fileName - the name of the file
path - the full path of the file
bucket_id - the ID of the bucket that the file was uploaded to
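To make the header-based JSON conversion concrete, here is a minimal Python sketch that turns CSV text into a JSON document with a metadata object. Only the `metadata` field and its `fileName`, `path`, and `bucket_id` keys come from the description above. The function name `build_json_payload`, the `data` key for the rows, and the string type for `bucket_id` are assumptions, so check a real delivery before relying on this shape.

```python
import csv
import io
import json


def build_json_payload(csv_text: str, file_name: str, path: str, bucket_id: str) -> str:
    """Convert CSV text to a JSON string keyed by header names (hypothetical shape)."""
    # Each CSV row becomes an object whose keys are the header names.
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    payload = {
        "data": rows,  # hypothetical key name; the real payload may differ
        "metadata": {"fileName": file_name, "path": path, "bucket_id": bucket_id},
    }
    return json.dumps(payload)
```

A webhook receiver could then parse the request body with `json.loads` and read `metadata` before deciding what to do with the rows.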