GDAL-PIPELINE(1) GDAL GDAL-PIPELINE(1)

NAME

gdal-pipeline - Process a dataset applying several steps

Added in version 3.12.

DESCRIPTION

gdal pipeline executes a pipeline: it reads a raster or vector input dataset, applies a series of steps, and finally writes a raster or vector dataset.

Unless otherwise stated in their documentation, most steps evaluate raster blocks or features on demand, without "materializing" the dataset resulting from each step. For performance reasons, it may sometimes be desirable to materialize an intermediate dataset to disk using gdal raster materialize or gdal vector materialize.

SYNOPSIS

Usage: gdal pipeline [OPTIONS] <PIPELINE>

Process a dataset applying several steps.

Positional arguments:

Common Options:

-h, --help
    Display help message and exit
--json-usage
    Display usage as JSON document and exit
--config <KEY>=<VALUE>
    Configuration option [may be repeated]
-q, --quiet
    Quiet mode (no progress bar)

Advanced Options:

--<user-provided-option>=<value>
    Argument provided by user

<PIPELINE> is of the form: read|calc|concat|mosaic|stack [READ-OPTIONS] ( ! <STEP-NAME> [STEP-OPTIONS] )* ! write|info|tile [WRITE-OPTIONS]


A pipeline chains several steps, separated with the ! (exclamation mark) character. The first step must be read, calc, concat, mosaic or stack, and the last one info, tile or write. Each step has its own positional or non-positional arguments. Apart from read, calc, concat, mosaic, stack, info, tile, partition and write, all other steps can potentially be used several times in a pipeline.
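
As an illustration of the chaining rule above, the following Python sketch assembles the argument list of a pipeline from a list of steps and checks the first and last step names before anything is run. The helper name build_pipeline_args is hypothetical and not part of GDAL. It accepts tee as a last step because the "Output nested pipeline" section of this page allows that. The sketch only builds the list and does not invoke gdal.

```python
# Hypothetical helper, not part of GDAL: it only assembles an argv list.
FIRST_STEPS = {"read", "calc", "concat", "mosaic", "stack"}
# "tee" is included because this page states it may also end a pipeline.
LAST_STEPS = {"info", "tile", "write", "tee"}


def build_pipeline_args(steps):
    """Return an argv list for `gdal pipeline`.

    `steps` is a list of steps, each one a list of string tokens,
    for example ["read", "in.tif"].
    """
    if not steps:
        raise ValueError("a pipeline needs at least one step")
    if steps[0][0] not in FIRST_STEPS:
        raise ValueError(f"first step must be one of {sorted(FIRST_STEPS)}")
    if steps[-1][0] not in LAST_STEPS:
        raise ValueError(f"last step must be one of {sorted(LAST_STEPS)}")

    argv = ["gdal", "pipeline"]
    for i, step in enumerate(steps):
        if i > 0:
            argv.append("!")  # steps are separated by a standalone "!"
        argv.extend(step)
    return argv


args = build_pipeline_args([
    ["read", "in.tif"],
    ["footprint"],
    ["buffer", "20"],
    ["write", "out.gpkg", "--overwrite"],
])
print(args)
```

Passing such a list to subprocess.run without a shell avoids any shell interpretation of the ! character.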

Example 1: Compute the footprint of a raster and apply a buffer on the footprint

$ gdal pipeline ! read in.tif ! footprint ! buffer 20 ! write out.gpkg --overwrite


For steps that take raster input and produce raster output, consult gdal raster pipeline. For steps that take vector input and produce vector output, consult gdal vector pipeline.

The following steps accept raster input and generate vector output:

contour

* contour [OPTIONS]
-------------------
Creates a vector contour from a raster elevation model (DEM).
Options:

-b, --band <BAND>
    Input band (1-based index) (default: 1)
--elevation-name <ELEVATION-NAME>
    Name of the elevation field
--min-name <MIN-NAME>
    Name of the minimum elevation field
--max-name <MAX-NAME>
    Name of the maximum elevation field
--3d
    Force production of 3D vectors instead of 2D
--src-nodata <SRC-NODATA>
    Input pixel value to treat as 'nodata'
--interval <INTERVAL>
    Elevation interval between contours
    Mutually exclusive with --levels, --exp-base
--levels <LEVELS>
    List of contour levels [may be repeated]
    Mutually exclusive with --interval, --exp-base
-e, --exp-base <EXP-BASE>
    Base for exponential contour level generation
    Mutually exclusive with --interval, --levels
--off, --offset <OFFSET>
    Offset to apply to contour levels
-p, --polygonize
    Create polygons instead of lines
--group-transactions <GROUP-TRANSACTIONS>
    Group n features per transaction (default 100 000)


Details for options can be found in gdal raster contour.

footprint

* footprint [OPTIONS]
---------------------
Compute the footprint of a raster dataset.
Options:

--output-layer <OUTPUT-LAYER>
    Output layer name (default: footprint)
-b, --band <BAND>
    Input band(s) (1-based index) [may be repeated]
--combine-bands <COMBINE-BANDS>
    Defines how the mask bands of the selected bands are combined to generate a single mask band, before being vectorized. COMBINE-BANDS=union|intersection (default: union)
--overview <OVERVIEW>
    Which overview level of source file must be used
    Mutually exclusive with --src-nodata
--src-nodata <SRC-NODATA>
    Set nodata values for input bands. [1.. values]
    Mutually exclusive with --overview
--coordinate-system <COORDINATE-SYSTEM>
    Target coordinate system. COORDINATE-SYSTEM=georeferenced|pixel
--dst-crs <DST-CRS>
    Destination CRS
--split-multipolygons
    Whether to split multipolygons as several features each with one single polygon
--convex-hull
    Whether to compute the convex hull of the footprint
--densify-distance <DENSIFY-DISTANCE>
    Maximum distance between 2 consecutive points of the output geometry.
--simplify-tolerance <SIMPLIFY-TOLERANCE>
    Tolerance used to merge consecutive points of the output geometry.
--min-ring-area <MIN-RING-AREA>
    Minimum value for the area of a ring
--max-points <MAX-POINTS>
    Maximum number of points of each output geometry (default: 100)
--location-field <LOCATION-FIELD>
    Name of the field where the path of the input dataset will be stored. (default: location)
    Mutually exclusive with --no-location-field
--no-location-field
    Disable creating a field with the path of the input dataset
    Mutually exclusive with --location-field
--absolute-path
    Whether the path to the input dataset should be stored as an absolute path


Details for options can be found in gdal raster footprint.

polygonize

* polygonize [OPTIONS]
----------------------
Create a polygon feature dataset from a raster band.
Options:

-b, --band <BAND>
    Input band (1-based index) (default: 1)
--attribute-name <ATTRIBUTE-NAME>
    Name of the field with the pixel value (default: DN)
-c, --connect-diagonal-pixels
    Consider diagonal pixels as connected


Details for options can be found in gdal raster polygonize.

The following steps accept vector input and generate raster output:

grid

* grid <COMMAND> [OPTIONS]
--------------------------
where <COMMAND> is one of:

- average: Create a regular grid from scattered points using moving average interpolation.
- average-distance: Create a regular grid from scattered points using the average distance between the grid node (center of the search ellipse) and all of the data points in the search ellipse.
- average-distance-points: Create a regular grid from scattered points using the average distance between the data points in the search ellipse.
- count: Create a regular grid from scattered points using the number of points in the search ellipse.
- invdist: Create a regular grid from scattered points using weighted inverse distance interpolation.
- invdistnn: Create a regular grid from scattered points using weighted inverse distance interpolation nearest neighbour.
- linear: Create a regular grid from scattered points using linear/barycentric interpolation.
- maximum: Create a regular grid from scattered points using the maximum value in the search ellipse.
- minimum: Create a regular grid from scattered points using the minimum value in the search ellipse.
- nearest: Create a regular grid from scattered points using nearest neighbor interpolation.
- range: Create a regular grid from scattered points using the difference between the minimum and maximum values in the search ellipse.


Details for options can be found in gdal vector grid.

rasterize

* rasterize [OPTIONS]
---------------------
Burns vector geometries into a raster.
Options:

-b, --band <BAND>
    The band(s) to burn values into (1-based index) [may be repeated]
--invert
    Invert the rasterization
--all-touched
    Enables the ALL_TOUCHED rasterization option
--burn <BURN>
    Burn value [may be repeated]
-a, --attribute-name <ATTRIBUTE-NAME>
    Attribute name
--3d
    Indicates that a burn value should be extracted from the Z values of the feature
-l, --input-layer <INPUT-LAYER>
    Input layer name
    Mutually exclusive with --sql
--where <WHERE>
    SQL where clause
--sql <SQL>
    SQL select statement
    Mutually exclusive with --input-layer
--dialect <DIALECT>
    SQL dialect
--nodata <NODATA>
    Assign a specified nodata value to output bands
--init <INIT>
    Pre-initialize output bands with specified value [may be repeated]
--crs <CRS>
    Override the projection for the output file
--transformer-option <NAME>=<VALUE>
    Set a transformer option suitable to pass to GDALCreateGenImgProjTransformer2 [may be repeated]
--extent <xmin>,<ymin>,<xmax>,<ymax>
    Set the target georeferenced extent [4 values]
--resolution <xres>,<yres>
    Set the target resolution [2 values]
    Mutually exclusive with --size
--tap, --target-aligned-pixels
    (target aligned pixels) Align the coordinates of the extent of the output file to the values of the resolution
--size <xsize>,<ysize>
    Set the target size in pixels and lines [2 values]
    Mutually exclusive with --resolution
--ot, --datatype, --output-data-type <OUTPUT-DATA-TYPE>
    Output data type. OUTPUT-DATA-TYPE=Byte|Int8|UInt16|Int16|UInt32|Int32|UInt64|Int64|CInt16|CInt32|Float16|Float32|Float64|CFloat32|CFloat64
--optimization <OPTIMIZATION>
    Force the algorithm used (results are identical). OPTIMIZATION=AUTO|RASTER|VECTOR (default: AUTO)


Details for options can be found in gdal vector rasterize.

tee

* tee [OPTIONS] [<PIPELINE>...]
-------------------------------
Pipes the input into the output stream and side nested pipelines.
Positional arguments:

--tee-pipeline <PIPELINE>
    Nested pipeline [1.. values]


Details for options can be found in the "Output nested pipeline" section below.

GDALG OUTPUT (ON-THE-FLY / STREAMED DATASET)

A pipeline can be serialized as a JSON file using the GDALG output format. The resulting file can then be opened as a dataset using the raster or vector GDALG (GDAL Streamed Algorithm) driver, which applies the specified pipeline in an on-the-fly / streamed way.

The command_line member of the JSON file should nominally be the whole command line without the final write step, and is what is generated by gdal pipeline ! .... ! write out.gdalg.json.

{
  "type": "gdal_streamed_alg",
  "command_line": "gdal pipeline ! read in.tif ! footprint ! buffer 20"
}


The final write step can be added but if so it must explicitly specify the stream output format and a non-significant output dataset name.

{
  "type": "gdal_streamed_alg",
  "command_line": "gdal pipeline ! read in.tif ! footprint ! buffer 20 ! write --output-format=streamed streamed_dataset"
}

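A document of the first form shown in this section can also be produced with any JSON library. The following is a minimal Python sketch. The helper name make_gdalg and the file name footprint.gdalg.json are hypothetical, and only the two members shown in this section are written.

```python
import json
import os
import tempfile


def make_gdalg(command_line):
    """Return the dictionary form of a GDALG document for a command line."""
    return {"type": "gdal_streamed_alg", "command_line": command_line}


# Command line without the final write step, as described in this section.
doc = make_gdalg("gdal pipeline ! read in.tif ! footprint ! buffer 20")

# Write into a temporary directory so the sketch leaves the working directory alone.
path = os.path.join(tempfile.mkdtemp(), "footprint.gdalg.json")
with open(path, "w", encoding="utf-8") as f:
    json.dump(doc, f, indent=2)

# Read the file back to show what was written.
with open(path, encoding="utf-8") as f:
    print(f.read())
```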

SUBSTITUTIONS

It is also possible to use gdal pipeline to run a pipeline already serialized in a .gdalg.json file, and customize its existing steps, typically by changing an input filename, specifying an output filename, or adding/modifying arguments of steps.

The syntax is:

gdal pipeline <filename.gdalg.json> --<step-name>.<arg-name>=value


When specifying an existing argument of a step of a pipeline, the value from the pipeline is overridden by the one specified on the gdal pipeline command line.

Let's imagine we have a raster_reproject.gdalg.json with the following content:

{
  "type": "gdal_streamed_alg",
  "command_line": "gdal pipeline ! read in.tif ! reproject --dst-crs=EPSG:4326 ! edit --metadata=CHANGES=reprojected"
}


It is possible to run it with the following command line, overriding the input argument of the read step, and implicitly adding a final write step with an output argument.

$ gdal pipeline raster_reproject.gdalg.json --read.input=other_input.tif --write.output=out.tif


When there is no ambiguity, it is also possible to omit the step name, and just specify the argument name (if there is an ambiguity, gdal pipeline will emit an error, so this is safe to do):

$ gdal pipeline raster_reproject.gdalg.json --input=other_input.tif --output=out.tif --co COMPRESS=LZW --overwrite


When a step appears several times in the pipeline, it must be specified as <step-name>[<idx>], where <idx> is a zero-based index.

For example, given:

{
  "type": "gdal_streamed_alg",
  "command_line": "gdal pipeline ! read in.tif ! edit --metadata=before=value ! reproject --dst-crs=EPSG:4326 ! edit --metadata=CHANGES=reprojected"
}


the following command line may be used:

$ gdal pipeline raster_reproject.gdalg.json --edit[0].metadata=before=modified --output=out.tif


Execution of pipelines and argument substitutions can also be done in Python with:

from osgeo import gdal
gdal.Run("pipeline", pipeline="raster_reproject.gdalg.json", output="out.tif", arguments={"edit[0].metadata": "before=modified"})
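
The substitution option names described in this section follow a regular pattern, so they can be generated. The sketch below is plain Python with a hypothetical helper, substitution_option, that formats --<step-name>[<idx>].<arg-name>=value. It only builds strings and does not call GDAL.

```python
def substitution_option(arg_name, value, step_name=None, index=None):
    """Format one substitution option for `gdal pipeline <file.gdalg.json>`.

    Hypothetical helper: `step_name` may be omitted when the argument name
    is unambiguous, and `index` is the zero-based occurrence of a step that
    appears several times in the pipeline.
    """
    if step_name is None:
        if index is not None:
            raise ValueError("an index requires a step name")
        return f"--{arg_name}={value}"
    step = step_name if index is None else f"{step_name}[{index}]"
    return f"--{step}.{arg_name}={value}"


argv = [
    "gdal", "pipeline", "raster_reproject.gdalg.json",
    substitution_option("metadata", "before=modified", step_name="edit", index=0),
    substitution_option("output", "out.tif"),
]
print(argv)
```

When such a list is passed to subprocess.run without a shell, the square brackets in edit[0] are not subject to shell pattern matching.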


NESTED PIPELINE

It is possible to create "nested pipelines", i.e. pipelines inside pipelines.

A nested pipeline is delimited by square brackets ([ and ]), each surrounded by space characters.

There are 2 kinds of nested pipelines:

  • input nested pipelines: where the result dataset of the nested pipeline is used as the input dataset for an argument of the main pipeline.
  • output nested pipelines: where the output of a step of the main pipeline is used as the input of the nested pipeline in a following step. Output nested pipelines can only be used with the tee step.

Input nested pipeline

Wherever an input dataset is expected as an auxiliary dataset, it is possible to specify it as the result of a nested pipeline. An input nested pipeline follows the same syntax as the outer pipeline, except that it must not end with an output-generating step such as info, tile or write.

Example 2: Combine the output of shaded relief map and hypsometric rendering on a DEM to create a colorized shaded relief map.

$ gdal pipeline read n43.tif ! \
    color-map --color-map color_file.txt ! \
    color-merge --grayscale \
        [ read n43.tif ! hillshade -z 30 ] ! \
    write out.tif --overwrite


In the above example, the value of the grayscale argument of the color-merge step is set as the output of the nested pipeline read n43.tif ! hillshade -z 30.
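
To make the bracket delimiters concrete, here is a small Python sketch that flattens a step list into command-line tokens, rendering any nested list of steps as a standalone [ ... ] group. The function name pipeline_tokens and the list-based representation are assumptions made for illustration; GDAL itself only sees the resulting tokens.

```python
def pipeline_tokens(steps):
    """Flatten steps into tokens.

    Each step is a list whose items are either string tokens or a nested
    pipeline, itself given as a list of steps.
    """
    tokens = []
    for i, step in enumerate(steps):
        if i > 0:
            tokens.append("!")  # step separator
        for item in step:
            if isinstance(item, str):
                tokens.append(item)
            else:
                # Nested pipeline: brackets are separate tokens, so they end
                # up surrounded by spaces once the tokens are joined.
                tokens.append("[")
                tokens.extend(pipeline_tokens(item))
                tokens.append("]")
    return tokens


main = [
    ["read", "n43.tif"],
    ["color-map", "--color-map", "color_file.txt"],
    ["color-merge", "--grayscale",
     [["read", "n43.tif"], ["hillshade", "-z", "30"]]],
    ["write", "out.tif", "--overwrite"],
]
argv = ["gdal", "pipeline"] + pipeline_tokens(main)
print(" ".join(argv))
```

The printed command is the single-line equivalent of Example 2.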

Output nested pipeline

The tee step in a pipeline forwards the input dataset as its output, and additionally executes one or several nested pipelines that take this input dataset as input and do other processing to eventually write the output of that processing. The first step of a tee output nested pipeline must not be read, calc, concat, mosaic or stack, and its last step must be write or tile. The tee operator can be either used in the middle of a pipeline or as its last step.

The example below shows the tee operator executing two output nested pipelines.

Example 3: Split the content of a "cities" layer according to whether its population is below or above 1 million.

$ gdal pipeline read cities.gpkg ! \
    tee [ filter --where "pop < 1e6" ! write small_cities.gpkg ] \
        [ filter --where "pop >= 1e6" ! write big_cities.gpkg ]
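
The constraints on tee nested pipelines stated above (no dataset-producing first step, and write or tile as last step) can be checked before gdal is invoked. The following Python sketch does so for the two branches of Example 3. The function name check_tee_branch and the list-based representation of steps are hypothetical.

```python
SOURCE_STEPS = {"read", "calc", "concat", "mosaic", "stack"}
SINK_STEPS = {"write", "tile"}


def check_tee_branch(steps):
    """Raise ValueError if a tee nested pipeline breaks the stated rules.

    `steps` is a list of steps, each a list of string tokens such as
    ["filter", "--where", "pop < 1e6"].
    """
    if not steps:
        raise ValueError("empty nested pipeline")
    first, last = steps[0][0], steps[-1][0]
    if first in SOURCE_STEPS:
        raise ValueError(f"a tee nested pipeline must not start with '{first}'")
    if last not in SINK_STEPS:
        raise ValueError(f"a tee nested pipeline must end with write or tile, not '{last}'")


branches = [
    [["filter", "--where", "pop < 1e6"], ["write", "small_cities.gpkg"]],
    [["filter", "--where", "pop >= 1e6"], ["write", "big_cities.gpkg"]],
]
for branch in branches:
    check_tee_branch(branch)  # both branches of Example 3 satisfy the rules
print("all tee branches are well-formed")
```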


The example below shows a more complicated use case, with two occurrences of tee, one of them being an output nested pipeline inside an input nested pipeline.

Example 4: Combine the output of shaded relief map and hypsometric rendering on a DEM to create a colorized shaded relief map, and write the intermediate hillshade and colorized datasets

$ gdal pipeline read n43.tif ! \
    color-map --color-map color_file.txt ! \
    tee [ write colored.tif --overwrite ] ! \
    color-merge --grayscale \
        [ read n43.tif ! hillshade -z 30 ! tee [ write hillshade.tif --overwrite ] ] ! \
    write colored-hillshade.tif --overwrite


EXAMPLES

Example 5: Compute the footprint of a raster and apply a buffer on the footprint

$ gdal pipeline ! read in.tif ! footprint ! buffer 20 ! write out.gpkg --overwrite


Example 6: Rasterize and reproject

$ gdal pipeline ! read in.gpkg ! rasterize --size 1000,1000 ! reproject --dst-crs EPSG:4326 ! write out.tif --overwrite


Example 7: Use an existing pipeline that rasterizes and reprojects, but change its input file and target CRS, and specify the output file

$ gdal pipeline raster_reproject.gdalg.json --input=my.gpkg --output=out.tif --dst-crs=EPSG:32631


AUTHOR

Even Rouault <even.rouault@spatialys.com>

COPYRIGHT

1998-2025

November 7, 2025