Tutorial

Extract SEO Metadata from Any URL with APIShard

Build a URL metadata scraper using APIShard's SEO Metadata Extractor. Extract titles, descriptions, Open Graph tags, and more with practical code examples.

APIShard Team

March 25, 2026 · 5 min read

SEO metadata—title tags, meta descriptions, Open Graph images—lives hidden in the <head> of every web page. As a backend developer, you might need to extract this metadata for link previews, web scraping, content aggregation, or SEO analysis. Parsing HTML manually is fragile; crawling JavaScript-heavy sites is expensive. The APIShard SEO Metadata Extractor solves this elegantly.

This tutorial walks you through building a real-world metadata scraper that extracts and stores SEO metadata in bulk.

The Problem

Imagine you're building a link-sharing platform or a content discovery tool. When a user pastes a URL, you want to show a rich preview:

🔗 Example Article
This is the article description that displays in social media and search results.
[Preview Image]

Pulling this data reliably requires:

  • Parsing HTML (<title>, <meta name="description">)
  • Extracting Open Graph tags for social sharing
  • Handling Twitter Cards for platform-specific previews
  • Dealing with canonical URLs and redirects
  • Not breaking when a site uses dynamic rendering
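
For concreteness, here's roughly what those tags look like in a page's <head>, and why hand-rolled parsing is brittle. The markup and regexes below are illustrative, not a recommended approach:

```typescript
// Illustrative <head> markup containing the tags listed above.
const html = `
<head>
  <title>Example Article</title>
  <meta name="description" content="This is the article description.">
  <meta property="og:title" content="Example Article">
  <meta property="og:image" content="https://example.com/preview.png">
  <meta name="twitter:card" content="summary_large_image">
  <link rel="canonical" href="https://example.com/article">
</head>`;

// A naive regex works on this snippet but breaks on attribute
// reordering, single quotes, multiline tags, or JS-rendered pages.
const title = html.match(/<title>([^<]*)<\/title>/)?.[1] ?? null;
const description =
  html.match(/<meta name="description" content="([^"]*)"/)?.[1] ?? null;
```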

The APIShard SEO Metadata Extractor handles all of this, leaving you free to focus on your application logic.

Getting Started

Prerequisites

You'll need:

  • An APIShard API key (get one free)
  • Node.js 18+ (the examples in this guide use TypeScript)
  • A basic HTTP client (axios, fetch, etc.)

Installation

npm install axios

Or use the built-in fetch:

// No additional packages needed if using Node 18+
const response = await fetch("https://api.apishard.com/v1/seo", {
  method: "POST",
  headers: {
    "X-API-KEY": process.env.APISHARD_API_KEY,
    "Content-Type": "application/json",
  },
  body: JSON.stringify({ url: "https://example.com" }),
});

Extract Metadata from a Single URL

Let's start simple. Here's how to extract metadata from one URL:

import axios from "axios";

const API_KEY = process.env.APISHARD_API_KEY;
const API_BASE = "https://api.apishard.com";

async function extractMetadata(url: string) {
  try {
    const response = await axios.post(`${API_BASE}/v1/seo`, 
      { url },
      {
        headers: {
          "X-API-KEY": API_KEY,
          "Content-Type": "application/json",
        },
      }
    );

    const data = response.data;
    console.log("URL:", data.url);
    console.log("Canonical:", data.canonical || "(none)");
    console.log("Title:", data.metadata.title);
    console.log("Description:", data.metadata.description);
    console.log("OG Image:", data.metadata.ogImage);
    console.log("Keywords:", data.keywords.join(", "));
    console.log(
      "Credits used:",
      response.headers["x-credit-used"]
    );

    return data;
  } catch (error) {
    if (axios.isAxiosError(error)) {
      const status = error.response?.status;
      if (status === 401) {
        console.error("Invalid API key");
      } else if (status === 403) {
        console.error("Insufficient credits");
      } else if (status === 400) {
        console.error("Invalid URL:", error.response?.data);
      } else {
        console.error("Error:", error.message);
      }
    } else {
      console.error("Unexpected error:", error);
    }
    throw error;
  }
}

// Usage
extractMetadata("https://github.com");

What you get back:

{
  "url": "https://github.com",
  "canonical": "https://github.com/",
  "metadata": {
    "title": "GitHub: Let's build from here",
    "description": "GitHub is where over 100 million developers shape the future of software...",
    "ogTitle": "GitHub",
    "ogDescription": "GitHub is where over 100 million developers...",
    "ogImage": "https://github.githubassets.com/images/modules/open_graph/github-logo.png",
    "ogType": "website",
    "twitterCard": "summary_large_image",
    "twitterTitle": "GitHub",
    "twitterDescription": "GitHub is where over 100 million developers...",
    "twitterImage": "https://github.githubassets.com/images/modules/open_graph/github-logo.png",
    "favicon": "https://github.githubassets.com/favicons/favicon.svg",
    "lang": "en"
  },
  "keywords": ["development", "software", "code", ...]
}

Credit cost: 5 credits per request (check the X-Credit-Used header).

Build a Batch Metadata Scraper

Now let's scale it. Here's a practical example: scrape metadata for a list of URLs and store the results in a database.

import axios from "axios";
import sqlite3 from "sqlite3";

const API_KEY = process.env.APISHARD_API_KEY;
const API_BASE = "https://api.apishard.com";

// Initialize SQLite database
const db = new sqlite3.Database("metadata.db");

db.run(`
  CREATE TABLE IF NOT EXISTS link_previews (
    id INTEGER PRIMARY KEY,
    url TEXT UNIQUE,
    canonical TEXT,
    title TEXT,
    description TEXT,
    og_image TEXT,
    og_type TEXT,
    twitter_card TEXT,
    favicon TEXT,
    language TEXT,
    keywords TEXT,
    extracted_at DATETIME DEFAULT CURRENT_TIMESTAMP
  )
`);

async function scrapeAndStore(urls: string[]) {
  let successCount = 0;
  let failureCount = 0;

  for (const url of urls) {
    try {
      const response = await axios.post(`${API_BASE}/v1/seo`, 
        { url },
        {
          headers: {
            "X-API-KEY": API_KEY,
            "Content-Type": "application/json",
          },
        }
      );

      const data = response.data;

      // Insert into database
      db.run(
        `INSERT OR REPLACE INTO link_previews 
         (url, canonical, title, description, og_image, og_type, twitter_card, favicon, language, keywords)
         VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)`,
        [
          data.url,
          data.canonical,
          data.metadata.title,
          data.metadata.description,
          data.metadata.ogImage,
          data.metadata.ogType,
          data.metadata.twitterCard,
          data.metadata.favicon,
          data.metadata.lang,
          JSON.stringify(data.keywords),
        ]
      );

      successCount++;
      console.log(`✓ Scraped: ${url}`);
    } catch (error) {
      failureCount++;
      const message = error instanceof Error ? error.message : String(error);
      console.error(`✗ Failed: ${url}`, message);
    }
  }

  console.log(
    `\nDone! Success: ${successCount}, Failed: ${failureCount}`
  );
}

// Usage
const urls = [
  "https://github.com",
  "https://developer.mozilla.org",
  "https://nodejs.org",
  "https://www.rust-lang.org",
];

scrapeAndStore(urls);

Real-World Pattern: Link Preview Service

Here's a service pattern fit for production use, combining API calls with in-memory caching:

import axios from "axios";
import NodeCache from "node-cache";

const API_KEY = process.env.APISHARD_API_KEY;
const API_BASE = "https://api.apishard.com";

// Cache metadata for 7 days (in seconds)
const cache = new NodeCache({ stdTTL: 604800 });

interface Metadata {
  title: string | null;
  description: string | null;
  image: string | null;
  url: string;
}

async function getLinkPreview(url: string): Promise<Metadata> {
  // Check cache first
  const cached = cache.get<Metadata>(url);
  if (cached) {
    console.log("Cache hit for:", url);
    return cached;
  }

  try {
    const response = await axios.post(
      `${API_BASE}/v1/seo`,
      { url },
      {
        headers: {
          "X-API-KEY": API_KEY,
          "Content-Type": "application/json",
        },
      }
    );

    const data = response.data;
    const metadata: Metadata = {
      title: data.metadata.title || data.metadata.ogTitle,
      description: data.metadata.description || data.metadata.ogDescription,
      image: data.metadata.ogImage,
      url: data.canonical || data.url,
    };

    // Cache the result
    cache.set(url, metadata);
    return metadata;
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    console.error("Failed to fetch metadata for:", url, message);
    throw error;
  }
}

// Example: Express route
import express from "express";
const app = express();
app.use(express.json()); // parse JSON request bodies so req.body is populated

app.post("/api/preview", async (req, res) => {
  const { url } = req.body;

  if (!url) {
    return res.status(400).json({ error: "URL is required" });
  }

  try {
    const preview = await getLinkPreview(url);
    res.json(preview);
  } catch (error) {
    res.status(500).json({ error: "Failed to extract metadata" });
  }
});

app.listen(3000, () => console.log("Server running on port 3000"));

Error Handling

The API returns specific error codes. Handle them gracefully:

async function extractWithErrorHandling(url: string) {
  try {
    const response = await axios.post(`${API_BASE}/v1/seo`, 
      { url },
      {
        headers: {
          "X-API-KEY": API_KEY,
          "Content-Type": "application/json",
        },
      }
    );
    return response.data;
  } catch (error) {
    const status = axios.isAxiosError(error) ? error.response?.status : undefined;
    if (status === 400) {
      console.error("Bad request - check your URL format");
    } else if (status === 401) {
      console.error("Unauthorized - check your API key");
    } else if (status === 403) {
      console.error("Out of credits - upgrade your plan");
    } else if (status === 429) {
      console.error("Rate limit hit - wait before retrying");
    } else {
      console.error("Network or unexpected error:", error);
    }
    throw error;
  }
}

Tips for Production

  1. Rate Limiting: The API enforces rate limits. Implement exponential backoff for retries.
  2. Caching: Cache results to avoid redundant API calls (like the example above).
  3. URL Validation: Validate URLs before sending to avoid wasting credits on invalid requests.
  4. Canonical URLs: Always use the canonical URL returned by the API—it's the true identity of the page.
  5. Fallbacks: If ogTitle is missing, fall back to title. Same for descriptions and images.
  6. Batch Processing: For many URLs, queue them and process in batches with delays to respect rate limits.
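
Tips 1 and 6 can be sketched as a small generic retry helper with exponential backoff. The retry count, base delay, and retryable-error check below are illustrative defaults, not values mandated by the API:

```typescript
// Compute the backoff delay for a given attempt: base, 2*base, 4*base, ...
function backoffDelay(attempt: number, baseMs: number): number {
  return baseMs * 2 ** attempt;
}

function sleep(ms: number): Promise<void> {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Retry an async operation while isRetryable(err) is true, backing off
// exponentially between attempts, up to maxRetries retries.
async function withRetry<T>(
  fn: () => Promise<T>,
  isRetryable: (err: unknown) => boolean,
  maxRetries = 3,
  baseMs = 1000
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (!isRetryable(err) || attempt >= maxRetries) throw err;
      await sleep(backoffDelay(attempt, baseMs));
    }
  }
}
```

You could then wrap the earlier single-URL call, e.g. `withRetry(() => extractMetadata(url), (err) => axios.isAxiosError(err) && err.response?.status === 429)`, and reuse the same helper to pace a batch job.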

Summary

The SEO Metadata Extractor API makes it simple to extract structured metadata from any URL without writing fragile HTML parsers. Whether you're building link previews, a content aggregator, or an SEO analyzer, this approach scales reliably.

Next steps: