Skip to main content

Cheerio 1.0 Released, Batteries Included ๐Ÿ”‹

ยท 3 min read
Felix Boehm
Maintainer of Cheerio

Cheerio 1.0 is out! After 12 release candidates and just a short seven years after the initial 1.0 release candidate, it is finally time to call Cheerio 1.0 complete. The theme for this release is "batteries included", with common use cases now supported out of the box.

So grab a pair of double-As, and read below for what's new, what's changed, and how to upgrade!

New Website and Documentationโ€‹

Since the last release, we've published a new website and documentation for Cheerio. The new site features detailed guides and API documentation to get the most from Cheerio. Check it out at cheerio.js.org.

A new way to load documentsโ€‹

Loading documents into Cheerio has been revamped. Cheerio now supports multiple loading methods, each tailored to different use cases:

  • load: The classic method for parsing HTML or XML strings.
  • loadBuffer: Works with binary data, automatically detecting the document encoding.
  • stringStream and decodeStream: Parse HTML directly from streams.
  • fromURL: Fetch and parse HTML from a URL in one go.

Dive deeper into these methods in the Loading Documents tutorial.

Simplified Data Extractionโ€‹

The new extract method allows you to extract data from an HTML document and store it in an object. To fetch the latest release of Cheerio from GitHub and extract the release date and the release notes from the release page is now as simple as:

import * as cheerio from 'cheerio';

const $ = await cheerio.fromURL(
'https://github.com/cheeriojs/cheerio/releases',
);
const data = $.extract({
releases: [
{
// First, we select individual release sections.
selector: 'section',
// Then, we extract the release date, name, and notes from each section.
value: {
// Selectors are executed within the context of the selected element.
name: 'h2',
date: {
selector: 'relative-time',
// The actual release date is stored in the `datetime` attribute.
value: 'datetime',
},
notes: {
selector: '.markdown-body',
// We are looking for the HTML content of the element.
value: 'innerHTML',
},
},
},
],
});

Read more about all of the available options in the Extracting Data guide.

Breaking Changes and Upgrade Guideโ€‹

Cheerio 1.0 introduces several breaking changes, most notably:

  • The minimum NodeJS version is now 18.17 or higher.

  • Import paths were simplified. For example, use cheerio/slim instead of cheerio/lib/slim.

  • The deprecated default Cheerio instance and static methods were removed.

    Before, it was possible to write code like this:

    import cheerio, { html } from 'cheerio';

    html(cheerio('<test></test>')); // ~ '<test></test>' -- NO LONGER WORKS

    Make sure to always load documents first:

    import * as cheerio from 'cheerio';

    cheerio.load('<test></test>').html();
  • htmlparser2 options now reside exclusively under the xml key:

    const $ = cheerio.load('<html>', {
    xml: {
    withStartIndices: true,
    },
    });
  • Node types previously re-exported by Cheerio must now be imported directly from domhandler.

For a comprehensive list of changes, please consult the changelog.

Upgrading to Cheerio 1.0โ€‹

To upgrade to Cheerio 1.0, just run:

npm install cheerio@latest

Get Involvedโ€‹

Explore the new features and let us know what you think! Encounter an issue? Report it on our GitHub issue tracker. Have an idea for an improvement? Pull requests welcome :)

Thank Youโ€‹

Thanks to @jugglinmike for kick-starting Cheerio 1.0, and to all the contributors who have helped shape this release. We couldn't have done it without you.

Thanks to our sponsors and backers for supporting Cheerio's development. If you use Cheerio at work, consider asking your company to support us!

And finally, thank you for using Cheerio ๐Ÿ™‡๐Ÿ™‡โ€โ™€๏ธ