GDAL-VECTOR-PIPELINE(1) GDAL GDAL-VECTOR-PIPELINE(1)

NAME

gdal-vector-pipeline - Process a vector dataset applying several steps

Added in version 3.11.

DESCRIPTION

gdal vector pipeline processes a vector dataset through a chain of steps, each of which accepts vector data as input and produces vector data as output.

For pipelines mixing raster and vector, consult gdal pipeline.

Unless their documentation states otherwise, most steps evaluate features on demand, without "materializing" the dataset that results from each step. For performance reasons, it can sometimes be desirable to materialize an intermediate dataset to disk using gdal vector materialize.

SYNOPSIS

Usage: gdal vector pipeline [OPTIONS] <PIPELINE>
Process a vector dataset applying several steps.
Positional arguments:
Common Options:

-h, --help Display help message and exit
--json-usage Display usage as JSON document and exit
--config <KEY>=<VALUE> Configuration option [may be repeated]
-q, --quiet Quiet mode (no progress bar)
Options:

--skip-errors Skip errors when writing features

<PIPELINE> is of the form: read|concat [READ-OPTIONS] ( ! <STEP-NAME> [STEP-OPTIONS] )* ! write|info [WRITE-OPTIONS]


A pipeline chains several steps, separated with the ! (exclamation mark) character. The first step must be read or concat, and the last one info, partition or write. Each step has its own positional or non-positional arguments. Apart from read, concat, info, partition and write, all other steps can potentially be used several times in a pipeline.
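
As a rough illustration of the chaining rule above, the following shell sketch joins a list of steps with ! and prints the resulting command instead of running it. The helper name build_pipeline and the file names in.gpkg and out.gpkg are hypothetical; only steps and options listed on this page are used.

```shell
# build_pipeline is a hypothetical helper, not part of GDAL.
# It takes one step per argument and joins them with " ! ".
build_pipeline() {
  cmd="gdal vector pipeline"
  for step in "$@"; do
    cmd="$cmd ! $step"
  done
  printf '%s\n' "$cmd"
}

# Dry run: first step is read, last step is write, as required.
# in.gpkg and out.gpkg are placeholder file names.
build_pipeline \
  "read in.gpkg" \
  "reproject --dst-crs=EPSG:32632" \
  "limit 100" \
  "write out.gpkg --overwrite"
```

The printed line can be pasted into a shell once the placeholder file names have been replaced with real ones.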

Potential steps are:

read

* read [OPTIONS] <INPUT>
------------------------
Read a vector dataset.
Positional arguments:

-i, --input <INPUT> Input vector datasets [required]
Options:

-l, --layer, --input-layer <INPUT-LAYER> Input layer name(s) [may be repeated]
Advanced Options:

--if, --input-format <INPUT-FORMAT> Input formats [may be repeated]
--oo, --open-option <KEY>=<VALUE> Open options [may be repeated]


buffer

* buffer [OPTIONS] <DISTANCE>
-----------------------------
Compute a buffer around geometries of a vector dataset.
Positional arguments:

--distance <DISTANCE> Distance to which to extend the geometry. [required]
Options:

--active-layer <ACTIVE-LAYER> Set active layer (if not specified, all)
--active-geometry <ACTIVE-GEOMETRY> Geometry field name to which to restrict the processing (if not specified, all)
--endcap-style <ENDCAP-STYLE> Endcap style. ENDCAP-STYLE=round|flat|square (default: round)
--join-style <JOIN-STYLE> Join style. JOIN-STYLE=round|mitre|bevel (default: round)
--mitre-limit <MITRE-LIMIT> Mitre ratio limit (only affects mitered join style). (default: 5)
--quadrant-segments <QUADRANT-SEGMENTS> Number of line segments used to approximate a quarter circle. (default: 8)
--side <SIDE> Sets whether the computed buffer should be single-sided or not. SIDE=both|left|right (default: both)


Details for options can be found in gdal vector buffer.

concat

* concat [OPTIONS] <INPUT>...
-----------------------------
Concatenate vector datasets.
Positional arguments:

-i, --input <INPUT> Input vector datasets [1.. values] [required]
Options:

-l, --layer, --input-layer <INPUT-LAYER> Input layer name(s) [may be repeated]
--mode <MODE> Determine the strategy to create output layers from source layers. MODE=merge-per-layer-name|stack|single (default: merge-per-layer-name)
--output-layer <OUTPUT-LAYER> Name of the output vector layer (single mode), or template to name the output vector layers (stack mode)
--source-layer-field-name <SOURCE-LAYER-FIELD-NAME> Name of the new field to add to contain identification of the source layer, with value determined from 'source-layer-field-content'
--source-layer-field-content <SOURCE-LAYER-FIELD-CONTENT> A string, possibly using {AUTO_NAME}, {DS_NAME}, {DS_BASENAME}, {DS_INDEX}, {LAYER_NAME}, {LAYER_INDEX}
--field-strategy <FIELD-STRATEGY> How to determine target fields from source fields. FIELD-STRATEGY=union|intersection (default: union)
-s, --src-crs <SRC-CRS> Source CRS
-d, --dst-crs <DST-CRS> Destination CRS
Advanced Options:

--if, --input-format <INPUT-FORMAT> Input formats [may be repeated]
--oo, --open-option <KEY>=<VALUE> Open options [may be repeated]


Details for options can be found in gdal vector concat.

clip

* clip [OPTIONS]
----------------
Clip a vector dataset.
Options:

--active-layer <ACTIVE-LAYER> Set active layer (if not specified, all)
--bbox <BBOX> Clipping bounding box as xmin,ymin,xmax,ymax
Mutually exclusive with --geometry, --like
--bbox-crs <BBOX-CRS> CRS of clipping bounding box
--geometry <GEOMETRY> Clipping geometry (WKT or GeoJSON)
Mutually exclusive with --bbox, --like
--geometry-crs <GEOMETRY-CRS> CRS of clipping geometry
--like <DATASET> Dataset to use as a template for bounds
Mutually exclusive with --bbox, --geometry
--like-sql <SELECT-STATEMENT> SELECT statement to run on the 'like' dataset
Mutually exclusive with --like-where
--like-layer <LAYER-NAME> Name of the layer of the 'like' dataset
--like-where <WHERE-EXPRESSION> WHERE SQL clause to run on the 'like' dataset
Mutually exclusive with --like-sql


Details for options can be found in gdal vector clip.

edit

* edit [OPTIONS]
----------------
Edit metadata of a vector dataset.
Options:

--active-layer <ACTIVE-LAYER> Set active layer (if not specified, all)
--geometry-type <GEOMETRY-TYPE> Layer geometry type
--crs <CRS> Override CRS (without reprojection)
--metadata <KEY>=<VALUE> Add/update dataset metadata item [may be repeated]
--unset-metadata <KEY> Remove dataset metadata item [may be repeated]
--layer-metadata <KEY>=<VALUE> Add/update layer metadata item [may be repeated]
--unset-layer-metadata <KEY> Remove layer metadata item [may be repeated]
--unset-fid Unset the identifier of each feature and the FID column name


Details for options can be found in gdal vector edit.

explode-collections

* explode-collections [OPTIONS]
-------------------------------
Explode geometries of type collection of a vector dataset.
Options:

--active-layer <ACTIVE-LAYER> Set active layer (if not specified, all)
--active-geometry <ACTIVE-GEOMETRY> Geometry field name to which to restrict the processing (if not specified, all)
--geometry-type <GEOMETRY-TYPE> Geometry type
--skip-on-type-mismatch Skip feature when change of feature geometry type failed


Details for options can be found in gdal vector explode-collections.

filter

* filter [OPTIONS]
------------------
Filter a vector dataset.
Options:

--active-layer <ACTIVE-LAYER> Set active layer (if not specified, all)
--bbox <BBOX> Bounding box as xmin,ymin,xmax,ymax
--where <WHERE>|@<filename> Attribute query in a restricted form of the queries used in the SQL WHERE statement


Details for options can be found in gdal vector filter.
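
The @<filename> form of --where can be sketched as follows: the attribute query is stored in a file and referenced with a leading @. The file name where.txt, the field name pop, and the dataset names are all hypothetical, and the command is only printed, not executed.

```shell
# Store the attribute query in a file, without a trailing newline.
# where.txt is a placeholder name; "pop" is a hypothetical field.
printf '%s' 'pop > 1e6' > where.txt

# Dry run: print a pipeline that reads the query from the file via @.
filter_cmd='gdal vector pipeline ! read in.shp ! filter --where @where.txt ! write out.gpkg'
printf '%s\n' "$filter_cmd"
```

Keeping the query in a file avoids shell quoting problems with long or quote-heavy expressions.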

limit

* limit [OPTIONS] <LIMIT>
-------------------------
Truncate a vector dataset to no more than a specified number of features.
Positional arguments:

--limit <LIMIT> Limit the number of features to read per layer [required]
Options:

--active-layer <ACTIVE-LAYER> Set active layer (if not specified, all)


make-valid

* make-valid [OPTIONS]
----------------------
Fix validity of geometries of a vector dataset.
Options:

--active-layer <ACTIVE-LAYER> Set active layer (if not specified, all)
--active-geometry <ACTIVE-GEOMETRY> Geometry field name to which to restrict the processing (if not specified, all)
--method <METHOD> Algorithm to use when repairing invalid geometries. METHOD=linework|structure (default: linework)
--keep-lower-dim Keep components of lower dimension after MakeValid()


Details for options can be found in gdal vector make-valid.

materialize

* materialize [OPTIONS]
-----------------------
Materialize a piped dataset on disk to increase the efficiency of the following steps.
Options:

-o, --output <OUTPUT> Materialized dataset name (created by algorithm)
-f, --of, --format, --output-format <OUTPUT-FORMAT> Output format
--co, --creation-option <KEY>=<VALUE> Creation option [may be repeated]
--lco, --layer-creation-option <KEY>=<VALUE> Layer creation option [may be repeated]
--overwrite Whether overwriting existing output is allowed


Details for options can be found in gdal vector materialize.

reproject

* reproject [OPTIONS]
---------------------
Reproject a vector dataset.
Options:

--active-layer <ACTIVE-LAYER> Set active layer (if not specified, all)
-s, --src-crs <SRC-CRS> Source CRS
-d, --dst-crs <DST-CRS> Destination CRS [required]


Details for options can be found in gdal vector reproject.

segmentize

* segmentize [OPTIONS] <MAX-LENGTH>
-----------------------------------
Segmentize geometries of a vector dataset.
Positional arguments:

--max-length <MAX-LENGTH> Maximum length of a segment [required]
Options:

--active-layer <ACTIVE-LAYER> Set active layer (if not specified, all)
--active-geometry <ACTIVE-GEOMETRY> Geometry field name to which to restrict the processing (if not specified, all)


Details for options can be found in gdal vector segmentize.

select

* select [OPTIONS] <FIELDS>
---------------------------
Select a subset of fields from a vector dataset.
Positional arguments:

--fields <FIELDS> Fields to select (or exclude if --exclude) [may be repeated] [required]
Options:

--active-layer <ACTIVE-LAYER> Set active layer (if not specified, all)
--exclude Exclude specified fields
Mutually exclusive with --ignore-missing-fields
--ignore-missing-fields Ignore missing fields
Mutually exclusive with --exclude


Details for options can be found in gdal vector select.

set-field-type

* set-field-type [OPTIONS]
--------------------------
Modify the type of a field of a vector dataset.
Options:

--active-layer <ACTIVE-LAYER> Set active layer (if not specified, all)
--field-name <FIELD-NAME> Field name [required]
Mutually exclusive with --src-field-type
--src-field-type <SRC-FIELD-TYPE> Source field type or subtype [required]
Mutually exclusive with --field-name
--dst-field-type, --field-type <FIELD-TYPE> Target field type or subtype [required]


Details for options can be found in gdal vector set-field-type.

set-geom-type

* set-geom-type [OPTIONS]
-------------------------
Modify the geometry type of a vector dataset.
Options:

--active-layer <ACTIVE-LAYER> Set active layer (if not specified, all)
--active-geometry <ACTIVE-GEOMETRY> Geometry field name to which to restrict the processing (if not specified, all)
--layer-only Only modify the layer geometry type
Mutually exclusive with --feature-only
--feature-only Only modify the geometry type of features
Mutually exclusive with --layer-only
--geometry-type <GEOMETRY-TYPE> Geometry type
--multi Force geometries to MULTI geometry types
Mutually exclusive with --single
--single Force geometries to non-MULTI geometry types
Mutually exclusive with --multi
--linear Convert curve geometries to linear types
Mutually exclusive with --curve
--curve Convert linear geometries to curve types
Mutually exclusive with --linear
--dim <DIM> Force geometries to the specified dimension. DIM=XY|XYZ|XYM|XYZM
--skip Skip feature when change of feature geometry type failed


Details for options can be found in gdal vector set-geom-type.

simplify

* simplify [OPTIONS] <TOLERANCE>
--------------------------------
Simplify geometries of a vector dataset.
Positional arguments:

--tolerance <TOLERANCE> Distance tolerance for simplification. [required]
Options:

--active-layer <ACTIVE-LAYER> Set active layer (if not specified, all)
--active-geometry <ACTIVE-GEOMETRY> Geometry field name to which to restrict the processing (if not specified, all)


Details for options can be found in gdal vector simplify.

simplify-coverage

* simplify-coverage [OPTIONS]
-----------------------------
Simplify shared boundaries of a polygonal vector dataset.
Options:

--active-layer <ACTIVE-LAYER> Set active layer (if not specified, all)
--tolerance <TOLERANCE> Distance tolerance for simplification. [required]
--preserve-boundary Whether the exterior boundary should be preserved.


Details for options can be found in gdal vector simplify-coverage.

sql

* sql [OPTIONS] <statement>|@<filename>
---------------------------------------
Apply SQL statement(s) to a dataset.
Positional arguments:

--sql <statement>|@<filename> SQL statement(s) [may be repeated] [required]
Options:

-l, --output-layer <OUTPUT-LAYER> Output layer name(s) [may be repeated]
--dialect <DIALECT> SQL dialect (e.g. OGRSQL, SQLITE)


Details for options can be found in gdal vector sql.

swap-xy

* swap-xy [OPTIONS]
-------------------
Swap X and Y coordinates of geometries of a vector dataset.
Options:

--active-layer <ACTIVE-LAYER> Set active layer (if not specified, all)
--active-geometry <ACTIVE-GEOMETRY> Geometry field name to which to restrict the processing (if not specified, all)


Details for options can be found in gdal vector swap-xy.

info

Added in version 3.12.

* info [OPTIONS]
----------------
Return information on a vector dataset.
Options:

-f, --of, --format, --output-format <OUTPUT-FORMAT> Output format. OUTPUT-FORMAT=json|text
-l, --layer, --input-layer <INPUT-LAYER> Input layer name [may be repeated]
Mutually exclusive with --sql
--features List all features (beware of RAM consumption on large layers)
Mutually exclusive with --summary
--summary List the layer names and the geometry type
Mutually exclusive with --features
--limit <FEATURE-COUNT> Limit the number of features per layer (implies --features)
--sql <statement>|@<filename> Execute the indicated SQL statement and return the result
Mutually exclusive with --input-layer
--where <WHERE>|@<filename> Attribute query in a restricted form of the queries used in the SQL WHERE statement
--dialect <DIALECT> SQL dialect


Details for options can be found in gdal vector info.

partition

Added in version 3.12.

* partition [OPTIONS] <OUTPUT>
------------------------------
Partition a vector dataset into multiple files.
Positional arguments:

-o, --output <OUTPUT> Output directory [required]
Options:

--overwrite Whether overwriting existing output is allowed
Mutually exclusive with --append
--append Whether appending to existing layer is allowed
Mutually exclusive with --overwrite
-f, --of, --format, --output-format <OUTPUT-FORMAT> Output format
--co, --creation-option <KEY>=<VALUE> Creation option [may be repeated]
--lco, --layer-creation-option <KEY>=<VALUE> Layer creation option [may be repeated]
--field <FIELD> Field(s) on which to partition [may be repeated] [required]
--scheme <SCHEME> Partitioning scheme. SCHEME=hive|flat (default: hive)
--pattern <PATTERN> Filename pattern ('part_%010d' for scheme=hive, '{LAYER_NAME}_{FIELD_VALUE}_%010d' for scheme=flat)
--feature-limit <FEATURE-LIMIT> Maximum number of features per file
--max-file-size <MAX-FILE-SIZE> Maximum file size (MB or GB suffix can be used)
--omit-partitioned-field Whether to omit partitioned fields from target layer definition
--skip-errors Skip errors when writing features


Details for options can be found in gdal vector partition.

write

* write [OPTIONS] <OUTPUT>
--------------------------
Write a vector dataset.
Positional arguments:

-o, --output <OUTPUT> Output vector dataset [required]
Options:

-f, --of, --format, --output-format <OUTPUT-FORMAT> Output format ("GDALG" allowed)
--co, --creation-option <KEY>=<VALUE> Creation option [may be repeated]
--lco, --layer-creation-option <KEY>=<VALUE> Layer creation option [may be repeated]
--overwrite Whether overwriting existing output is allowed
--update Whether to open existing dataset in update mode
--overwrite-layer Whether overwriting existing output is allowed
--append Whether appending to existing layer is allowed
Mutually exclusive with --upsert
-l, --output-layer <OUTPUT-LAYER> Output layer name
--skip-errors Skip errors when writing features
Advanced Options:

--output-oo, --output-open-option <KEY>=<VALUE> Output open options [may be repeated]
--upsert Upsert features (implies 'append')
Mutually exclusive with --append


GDALG OUTPUT (ON-THE-FLY / STREAMED DATASET)

A pipeline can be serialized as a JSON file using the GDALG output format. The resulting file can then be opened as a vector dataset using the GDALG: GDAL Streamed Algorithm driver, which applies the specified pipeline in an on-the-fly / streamed way.

The command_line member of the JSON file should nominally be the whole command line without the final write step, and is what is generated by gdal vector pipeline ! .... ! write out.gdalg.json.

{
    "type": "gdal_streamed_alg",
    "command_line": "gdal vector pipeline ! read in.gpkg ! reproject --dst-crs=EPSG:32632"
}


The final write step can be added but if so it must explicitly specify the stream output format and a non-significant output dataset name.

{
    "type": "gdal_streamed_alg",
    "command_line": "gdal vector pipeline ! read in.gpkg ! reproject --dst-crs=EPSG:32632 ! write --output-format=streamed streamed_dataset"
}


SUBSTITUTIONS

Added in version 3.12.

gdal pipeline can also run a pipeline already serialized in a .gdalg.json file and customize its existing steps, typically by changing an input filename, specifying an output filename, or adding/modifying arguments of steps.

See Substitutions.

NESTED PIPELINE

Added in version 3.12.

It is possible to create "nested pipelines", i.e. pipelines inside pipelines.

A nested pipeline is delimited by square brackets ([ and ]), each of which must be surrounded by space characters.

There are 2 kinds of nested pipelines:

  • input nested pipelines: where the result dataset of the nested pipeline is used as the input dataset for an argument of the main pipeline.
  • output nested pipelines: where the output of a step of the main pipeline is used as the input of the nested pipeline in a following step. Output nested pipelines can only be used with the tee step.

See Nested pipeline.
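
A sketch of an input nested pipeline, under the assumption that the --like argument of clip accepts a nested pipeline as its dataset (this page does not state which arguments do). The file names mask.gpkg, in.gpkg and out.gpkg are placeholders, and the command is only printed, not executed.

```shell
# Each bracket is a separate word, surrounded by spaces.
# Assumption: the buffered mask is accepted as the --like dataset of clip.
nested='[ read mask.gpkg ! buffer 100 ]'

# Dry run: print the main pipeline with the nested pipeline in place.
nested_cmd="gdal vector pipeline ! read in.gpkg ! clip --like $nested ! write out.gpkg --overwrite"
printf '%s\n' "$nested_cmd"
```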

EXAMPLES

Example 1: Reproject a GeoPackage file to CRS EPSG:32632 ("WGS 84 / UTM zone 32N")

$ gdal vector pipeline ! read in.gpkg ! reproject --dst-crs=EPSG:32632 ! write out.gpkg --overwrite


Example 2: Serialize the command of a reprojection of a GeoPackage file in a GDALG file, and later read it

$ gdal vector pipeline ! read in.gpkg ! reproject --dst-crs=EPSG:32632 ! write in_epsg_32632.gdalg.json --overwrite
$ gdal vector info in_epsg_32632.gdalg.json
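
Example 2 lets gdal generate the GDALG file. As a sketch of the alternative, the same JSON structure shown in the GDALG OUTPUT section can be written by hand from a shell script. The helper name write_gdalg is hypothetical, and in.gpkg is a placeholder input.

```shell
# write_gdalg is a hypothetical helper, not part of GDAL.
# $1: path of the .gdalg.json file to create
# $2: pipeline command line, without the final write step
# Note: $2 is inserted as-is, so it must not contain double quotes
# or backslashes, which would need JSON escaping.
write_gdalg() {
  cat > "$1" <<EOF
{
    "type": "gdal_streamed_alg",
    "command_line": "$2"
}
EOF
}

write_gdalg in_epsg_32632.gdalg.json \
  "gdal vector pipeline ! read in.gpkg ! reproject --dst-crs=EPSG:32632"
cat in_epsg_32632.gdalg.json
```

The resulting file could then be opened like any other vector dataset, for instance with gdal vector info as in Example 2.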


Example 3: Union 2 source shapefiles (with similar structure), reproject them to EPSG:32632, keep only cities with more than 1 million inhabitants, and write to a GeoPackage

$ gdal vector pipeline ! concat --mode=single --dst-crs=EPSG:32632 france.shp belgium.shp ! filter --where "pop > 1e6" ! write out.gpkg --overwrite


AUTHOR

Even Rouault <even.rouault@spatialys.com>

COPYRIGHT

1998-2025

October 20, 2025