
Getting Started with Rasterio: Read, Write, and Analyze Geospatial Rasters

  • Writer: Anvita Shrivastava

Geospatial raster data underpins many kinds of analysis, from satellite imagery and land use classification to digital elevation models (DEMs) and weather forecasting. For developers and GIS professionals using Python, Rasterio is a highly regarded library for reading, writing, and analyzing raster data.


Rasterio is built on the GDAL library and provides a simpler, more developer-friendly interface for raster operations while retaining GDAL's high-performance geospatial processing capabilities.



What is Rasterio?


Rasterio is a free, open-source Python library for reading, writing, and manipulating geospatial raster datasets. It exposes raster data as NumPy arrays, so rasters fit naturally into the scientific Python ecosystem for high-performance analysis.


Rasterio reads and writes the raster formats supported by the underlying GDAL (Geospatial Data Abstraction Library) build, including:


  • GeoTIFF (TIFF)

  • ERDAS Imagine (IMG)

  • NetCDF

  • JPEG2000

  • PNG

  • Cloud Optimized GeoTIFF (COG)


Unlike raw GDAL bindings, Rasterio provides a simple interface to geospatial datasets while giving developers access to many complex geospatial capabilities.


Why Use Rasterio for Geospatial Analysis?


Rasterio is widely used by developers and analysts because it balances simplicity with performance.


Key Features


Python API

Rasterio expresses geospatial operations through familiar Python idioms such as context managers, keyword arguments, and NumPy arrays.


NumPy Support

Raster bands are read directly into NumPy arrays, ready for scientific computing and machine learning tasks.


Automatic CRS & Transforms

Rasterio reads each dataset's coordinate reference system and affine transform from the file and exposes them as the crs and transform attributes of the dataset object.


Windowed Reading

Rasterio supports partial raster access, which is important for working with large datasets or when performing cloud-based processes.


Cloud-Native Support

Rasterio can read cloud-hosted geospatial data, including COGs stored in object storage, and, combined with windowed reading, can fetch only the parts of a file it needs.


Installing Rasterio


Rasterio can be installed using pip or conda.


Install with pip

pip install rasterio

Install with conda

conda install -c conda-forge rasterio

Conda is often recommended because it installs GDAL and the other native dependencies alongside Rasterio.


To verify the installation:

import rasterio
print(rasterio.__version__)

Reading Raster Files with Rasterio


Reading raster datasets is straightforward using Rasterio’s context manager.


Open a Raster File

import rasterio

raster_path = "example.tif"

with rasterio.open(raster_path) as src:
    print(src.name)

The dataset object provides access to raster metadata and pixel data.


Inspecting Raster Metadata


Raster metadata contains essential spatial information.


Access Raster Properties

with rasterio.open(raster_path) as src:
    print("Width:", src.width)
    print("Height:", src.height)
    print("Bands:", src.count)
    print("CRS:", src.crs)
    print("Bounds:", src.bounds)
    print("Data Type:", src.dtypes)

Example output:

Width: 2048
Height: 2048
Bands: 3
CRS: EPSG:4326
Bounds: BoundingBox(left=-180, bottom=-90, right=180, top=90)

These metadata properties are critical for accurate spatial analysis.


Reading Raster Bands


Raster datasets may contain multiple bands.


Read a Single Band

with rasterio.open(raster_path) as src:
    band1 = src.read(1)

print(band1.shape)

Rasterio returns the band as a two-dimensional NumPy array of shape (rows, columns). Note that band indexes start at 1, not 0.


Reading Multiple Bands

with rasterio.open(raster_path) as src:
    data = src.read()

print(data.shape)

Typical output:

(3, 2048, 2048)

Where:

  • 3 = number of bands

  • 2048 = rows

  • 2048 = columns


Visualizing Raster Data


Rasterio works well with Matplotlib.


Display a Raster Band

import matplotlib.pyplot as plt

with rasterio.open(raster_path) as src:
    band1 = src.read(1)

plt.imshow(band1, cmap="terrain")
plt.colorbar()
plt.title("Raster Band 1")
plt.show()

This is commonly used for DEM visualization and remote sensing analysis.


Writing Raster Files


Rasterio also supports raster creation and export workflows.


Create a New Raster

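import numpy as np
import rasterio
from rasterio.transform import from_origin

# Random 100 x 100 single-band image with values in 0..254
data = np.random.randint(0, 255, (100, 100), dtype="uint8")

# Upper-left corner and pixel size; the top edge sits at latitude 50 so that
# the 100-row, 1-degree grid stays within the valid latitude range of EPSG:4326
transform = from_origin(
    west=0,
    north=50,
    xsize=1,
    ysize=1
)

with rasterio.open(
    "output.tif",
    "w",
    driver="GTiff",
    height=data.shape[0],
    width=data.shape[1],
    count=1,
    dtype=data.dtype,
    crs="EPSG:4326",
    transform=transform
) as dst:
    dst.write(data, 1)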

This creates a georeferenced GeoTIFF raster.


Working with Coordinate Reference Systems (CRS)


Coordinate systems are fundamental in GIS workflows.

Rasterio makes CRS management simple.


Check CRS

with rasterio.open(raster_path) as src:
    print(src.crs)

Example:

EPSG:4326

Reprojecting Raster Data


Raster reprojection converts datasets between coordinate systems.


Reproject a Raster

from rasterio.warp import calculate_default_transform, reproject, Resampling

dst_crs = "EPSG:3857"

with rasterio.open(raster_path) as src:
    transform, width, height = calculate_default_transform(
        src.crs,
        dst_crs,
        src.width,
        src.height,
        *src.bounds
    )
    kwargs = src.meta.copy()
    kwargs.update({
        "crs": dst_crs,
        "transform": transform,
        "width": width,
        "height": height
    })

    with rasterio.open("reprojected.tif", "w", **kwargs) as dst:
        for i in range(1, src.count + 1):
            reproject(
                source=rasterio.band(src, i),
                destination=rasterio.band(dst, i),
                src_transform=src.transform,
                src_crs=src.crs,
                dst_transform=transform,
                dst_crs=dst_crs,
                resampling=Resampling.nearest
            )

Reprojection is essential when combining raster layers from different spatial reference systems.


Windowed Reading for Large Raster Files


Large rasters can exceed memory limits.

Rasterio supports efficient window-based processing.


Read a Raster Window

from rasterio.windows import Window

with rasterio.open(raster_path) as src:
    window = Window(
        col_off=0,
        row_off=0,
        width=512,
        height=512
    )
    subset = src.read(1, window=window)

print(subset.shape)

Benefits include:

  • Lower memory usage

  • Faster processing

  • Cloud-optimized workflows

  • Distributed geospatial computing


Raster Analysis with NumPy


Because Rasterio integrates directly with NumPy, advanced raster analysis becomes straightforward.


Calculate Raster Statistics

import numpy as np

with rasterio.open(raster_path) as src:
    band = src.read(1)

print("Min:", np.min(band))
print("Max:", np.max(band))
print("Mean:", np.mean(band))

Masking NoData Values


Many rasters contain NoData regions.


Read with Masking

with rasterio.open(raster_path) as src:
    band = src.read(1, masked=True)

print(band.mask)

Masked arrays prevent invalid pixels from affecting analysis results.


Cropping Raster Data


Rasterio supports clipping rasters using vector geometries. The geometries must be in the same CRS as the raster.


Crop Using a Polygon

from rasterio.mask import mask
import fiona

with fiona.open("boundary.geojson") as shapefile:
    shapes = [feature["geometry"] for feature in shapefile]

with rasterio.open(raster_path) as src:
    out_image, out_transform = mask(src, shapes, crop=True)

This is widely used in:

  • Environmental analysis

  • Urban planning

  • Precision agriculture

  • Hydrological modeling


Best Practices for Rasterio Workflows


Use Context Managers

Open rasters inside a with rasterio.open(...) block so that the file is closed automatically, even if an error occurs.


Validate CRS Consistency

Verify that all datasets share the same coordinate reference system before combining them, and reproject where they differ.


Avoid Full Raster Loads

Use windows and chunking for scalability.


Store Metadata Carefully

Preserve geospatial metadata during exports.


Use Compression for Production

A commonly used lossless GeoTIFF compression option, passed as a keyword argument to rasterio.open() when writing:

compress="lzw"

Rasterio is now regarded as one of the most important Python libraries for modern raster processing in GIS and geospatial analysis. Its straightforward API, NumPy compatibility, and scalable processing capabilities make it a strong choice for GIS professionals, remote sensing analysts, and developers.


Once you have mastered the fundamentals (reading raster bands, creating GeoTIFF files, managing coordinate systems, and building efficient processing workflows), you can apply Rasterio to scientific research, environmental monitoring, machine learning, and enterprise GIS.


As geospatial datasets continue to grow in size and complexity, Rasterio has become an indispensable library for building scalable, cloud-native raster processing pipelines.




