Cheerio 1.0 Released, Batteries Included ๐
Cheerio 1.0 is out! After 12 release candidates and just a short seven years after the initial 1.0 release candidate, it is finally time to call Cheerio 1.0 complete. The theme for this release is "batteries included", with common use cases now supported out of the box.
So grab a pair of double-As, and read below for what's new, what's changed, and how to upgrade!
New Website and Documentationโ
Since the last release, we've published a new website and documentation for Cheerio. The new site features detailed guides and API documentation to get the most from Cheerio. Check it out at cheerio.js.org.
A new way to load documentsโ
Loading documents into Cheerio has been revamped. Cheerio now supports multiple loading methods, each tailored to different use cases:
load
: The classic method for parsing HTML or XML strings.loadBuffer
: Works with binary data, automatically detecting the document encoding.stringStream
anddecodeStream
: Parse HTML directly from streams.fromURL
: Fetch and parse HTML from a URL in one go.
Dive deeper into these methods in the Loading Documents tutorial.
Simplified Data Extractionโ
The new extract
method allows you to extract data from an HTML document and
store it in an object. To fetch the latest release of Cheerio from GitHub and
extract the release date and the release notes from the release page is now as
simple as:
import * as cheerio from 'cheerio';
const $ = await cheerio.fromURL(
'https://github.com/cheeriojs/cheerio/releases',
);
const data = $.extract({
releases: [
{
// First, we select individual release sections.
selector: 'section',
// Then, we extract the release date, name, and notes from each section.
value: {
// Selectors are executed within the context of the selected element.
name: 'h2',
date: {
selector: 'relative-time',
// The actual release date is stored in the `datetime` attribute.
value: 'datetime',
},
notes: {
selector: '.markdown-body',
// We are looking for the HTML content of the element.
value: 'innerHTML',
},
},
},
],
});
Read more about all of the available options in the Extracting Data guide.
Breaking Changes and Upgrade Guideโ
Cheerio 1.0 introduces several breaking changes, most notably:
-
The minimum NodeJS version is now 18.17 or higher.
-
Import paths were simplified. For example, use
cheerio/slim
instead ofcheerio/lib/slim
. -
The deprecated default Cheerio instance and static methods were removed.
Before, it was possible to write code like this:
import cheerio, { html } from 'cheerio';
html(cheerio('<test></test>')); // ~ '<test></test>' -- NO LONGER WORKSMake sure to always load documents first:
import * as cheerio from 'cheerio';
cheerio.load('<test></test>').html(); -
htmlparser2 options now reside exclusively under the
xml
key:const $ = cheerio.load('<html>', {
xml: {
withStartIndices: true,
},
}); -
Node types previously re-exported by Cheerio must now be imported directly from
domhandler
.
For a comprehensive list of changes, please consult the changelog.
Upgrading to Cheerio 1.0โ
To upgrade to Cheerio 1.0, just run:
npm install cheerio@latest
Get Involvedโ
Explore the new features and let us know what you think! Encounter an issue? Report it on our GitHub issue tracker. Have an idea for an improvement? Pull requests welcome :)
Thank Youโ
Thanks to @jugglinmike for kick-starting Cheerio 1.0, and to all the contributors who have helped shape this release. We couldn't have done it without you.
Thanks to our sponsors and backers for supporting Cheerio's development. If you use Cheerio at work, consider asking your company to support us!
And finally, thank you for using Cheerio ๐๐โโ๏ธ