Content Markdown Astro

Astro Content Collections Deep Dive

Master Astro's content collections feature for managing markdown content, type safety, and building powerful content-driven sites.

Lisa Wang
Lisa Wang
9 min read

Astro Content Collections Deep Dive

Content Collections are one of Astro’s most powerful features for managing content-driven websites. They provide type safety, validation, and excellent developer experience for working with markdown and other content formats. This comprehensive guide covers everything you need to know.

What are Content Collections?

Content Collections are Astro’s built-in solution for managing structured content like blog posts, documentation, product descriptions, and more. They provide:

  • Type Safety: Automatic TypeScript types for your content
  • Schema Validation: Ensure content follows defined structures
  • Performance: Optimized content loading and processing
  • Developer Experience: Excellent tooling and error messages

Setting Up Content Collections

Basic Configuration

Create a configuration file to define your collections:

// src/content/config.ts
import { defineCollection, z } from 'astro:content';

const blog = defineCollection({
  type: 'content',
  schema: z.object({
    title: z.string(),
    description: z.string(),
    publishDate: z.date(),
    author: z.string(),
    tags: z.array(z.string()),
    featured: z.boolean().default(false),
    draft: z.boolean().default(false),
  }),
});

export const collections = {
  blog,
};

Directory Structure

Organize your content in the src/content directory:

src/content/
├── config.ts
└── blog/
    ├── first-post.md
    ├── second-post.md
    └── advanced-guide/
        ├── index.md
        └── images/
            └── diagram.png

Content Files

Create markdown files with frontmatter:

---
title: "Getting Started with Astro"
description: "Learn the basics of building websites with Astro"
publishDate: 2024-01-15
author: "Lisa Wang"
tags: ["astro", "tutorial", "beginner"]
featured: true
---

# Getting Started with Astro

Welcome to this comprehensive guide on Astro...

Advanced Schema Definitions

Complex Data Types

// src/content/config.ts
import { defineCollection, z } from 'astro:content';

const blog = defineCollection({
  type: 'content',
  schema: ({ image }) => z.object({
    title: z.string(),
    description: z.string(),
    publishDate: z.date(),
    updatedDate: z.date().optional(),
    
    // Author information
    author: z.object({
      name: z.string(),
      email: z.string().email(),
      avatar: image(),
      bio: z.string().optional(),
    }),
    
    // SEO and social
    seo: z.object({
      title: z.string().optional(),
      description: z.string().optional(),
      keywords: z.array(z.string()).optional(),
      ogImage: image().optional(),
    }).optional(),
    
    // Content organization
    category: z.enum(['tutorial', 'guide', 'news', 'case-study']),
    tags: z.array(z.string()),
    series: z.string().optional(),
    
    // Content metadata
    readingTime: z.number().optional(),
    difficulty: z.enum(['beginner', 'intermediate', 'advanced']).default('beginner'),
    featured: z.boolean().default(false),
    draft: z.boolean().default(false),
    
    // Related content
    relatedPosts: z.array(z.string()).optional(),
    
    // Custom fields
    customData: z.record(z.any()).optional(),
  }),
});

const products = defineCollection({
  type: 'content',
  schema: ({ image }) => z.object({
    name: z.string(),
    description: z.string(),
    price: z.number().positive(),
    images: z.array(image()),
    category: z.string(),
    tags: z.array(z.string()),
    
    // Product specifications
    specifications: z.object({
      dimensions: z.object({
        width: z.number(),
        height: z.number(),
        depth: z.number(),
        unit: z.enum(['cm', 'in']).default('cm'),
      }).optional(),
      weight: z.object({
        value: z.number(),
        unit: z.enum(['kg', 'lb']).default('kg'),
      }).optional(),
      materials: z.array(z.string()).optional(),
    }).optional(),
    
    // Inventory
    sku: z.string(),
    inventory: z.number().nonnegative(),
    inStock: z.boolean().default(true),
    
    // Variants
    variants: z.array(z.object({
      id: z.string(),
      name: z.string(),
      price: z.number().positive(),
      sku: z.string(),
      attributes: z.record(z.string()),
    })).optional(),
  }),
});

export const collections = {
  blog,
  products,
};

Custom Validation

const blog = defineCollection({
  type: 'content',
  schema: z.object({
    title: z.string().min(1).max(100),
    slug: z.string().regex(/^[a-z0-9-]+$/),
    publishDate: z.date().refine(
      (date) => date <= new Date(),
      { message: "Publish date cannot be in the future" }
    ),
    tags: z.array(z.string()).min(1).max(5),
    readingTime: z.number().min(1).max(60),
  }),
});

Querying Content Collections

Basic Queries

---
// src/pages/blog/index.astro
import { getCollection } from 'astro:content';
import Layout from '../../layouts/Layout.astro';

// Get all blog posts
const allPosts = await getCollection('blog');

// Filter out drafts in production
const publishedPosts = allPosts.filter(post => !post.data.draft);

// Sort by publish date
const sortedPosts = publishedPosts.sort(
  (a, b) => b.data.publishDate.valueOf() - a.data.publishDate.valueOf()
);
---

<Layout title="Blog">
  <h1>Blog Posts</h1>
  {sortedPosts.map(post => (
    <article>
      <h2>
        <a href={`/blog/${post.slug}`}>{post.data.title}</a>
      </h2>
      <p>{post.data.description}</p>
      <time>{post.data.publishDate.toLocaleDateString()}</time>
    </article>
  ))}
</Layout>

Advanced Filtering

---
import { getCollection } from 'astro:content';

// Filter by category
const tutorials = await getCollection('blog', ({ data }) => {
  return data.category === 'tutorial' && !data.draft;
});

// Filter by tag
const astroTutorials = await getCollection('blog', ({ data }) => {
  return data.tags.includes('astro') && !data.draft;
});

// Filter by date range
const recentPosts = await getCollection('blog', ({ data }) => {
  const oneMonthAgo = new Date();
  oneMonthAgo.setMonth(oneMonthAgo.getMonth() - 1);
  return data.publishDate >= oneMonthAgo && !data.draft;
});

// Complex filtering
const featuredAdvancedPosts = await getCollection('blog', ({ data }) => {
  return data.featured && 
         data.difficulty === 'advanced' && 
         !data.draft;
});
---

Dynamic Routes

---
// src/pages/blog/[...slug].astro
import { getCollection } from 'astro:content';
import Layout from '../../layouts/Layout.astro';

export async function getStaticPaths() {
  const posts = await getCollection('blog');
  
  return posts.map(post => ({
    params: { slug: post.slug },
    props: { post },
  }));
}

const { post } = Astro.props;
const { Content } = await post.render();
---

<Layout title={post.data.title} description={post.data.description}>
  <article>
    <header>
      <h1>{post.data.title}</h1>
      <p>{post.data.description}</p>
      <time>{post.data.publishDate.toLocaleDateString()}</time>
      <div class="tags">
        {post.data.tags.map(tag => (
          <span class="tag">{tag}</span>
        ))}
      </div>
    </header>
    
    <Content />
  </article>
</Layout>

Working with Images

Image Schema

const blog = defineCollection({
  type: 'content',
  schema: ({ image }) => z.object({
    title: z.string(),
    heroImage: image(),
    gallery: z.array(image()).optional(),
    author: z.object({
      name: z.string(),
      avatar: image(),
    }),
  }),
});

Using Images in Content

---
import { Image } from 'astro:assets';
import { getEntry } from 'astro:content';

const post = await getEntry('blog', 'my-post');
---

<Layout>
  <article>
    <!-- Hero image -->
    <Image
      src={post.data.heroImage}
      alt={post.data.title}
      width={800}
      height={400}
    />
    
    <!-- Author avatar -->
    <div class="author">
      <Image
        src={post.data.author.avatar}
        alt={post.data.author.name}
        width={50}
        height={50}
      />
      <span>{post.data.author.name}</span>
    </div>
    
    <!-- Gallery -->
    {post.data.gallery && (
      <div class="gallery">
        {post.data.gallery.map(image => (
          <Image src={image} alt="" width={300} height={200} />
        ))}
      </div>
    )}
  </article>
</Layout>

Content Relationships

// src/utils/content.ts
import { getCollection } from 'astro:content';

export async function getRelatedPosts(currentPost, limit = 3) {
  const allPosts = await getCollection('blog');
  
  // Filter out current post and drafts
  const otherPosts = allPosts.filter(
    post => post.slug !== currentPost.slug && !post.data.draft
  );
  
  // Calculate similarity based on tags
  const postsWithScore = otherPosts.map(post => {
    const commonTags = post.data.tags.filter(tag => 
      currentPost.data.tags.includes(tag)
    );
    const score = commonTags.length;
    
    return { post, score };
  });
  
  // Sort by score and return top results
  return postsWithScore
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map(item => item.post);
}

Series and Collections

// src/utils/series.ts
import { getCollection } from 'astro:content';

export async function getSeriesPosts(seriesName) {
  const allPosts = await getCollection('blog');
  
  return allPosts
    .filter(post => post.data.series === seriesName && !post.data.draft)
    .sort((a, b) => a.data.publishDate.valueOf() - b.data.publishDate.valueOf());
}

export async function getPostNavigation(currentPost) {
  if (!currentPost.data.series) return { prev: null, next: null };
  
  const seriesPosts = await getSeriesPosts(currentPost.data.series);
  const currentIndex = seriesPosts.findIndex(post => post.slug === currentPost.slug);
  
  return {
    prev: currentIndex > 0 ? seriesPosts[currentIndex - 1] : null,
    next: currentIndex < seriesPosts.length - 1 ? seriesPosts[currentIndex + 1] : null,
  };
}

Content Transformations

Custom Markdown Processing

// astro.config.mjs
import { defineConfig } from 'astro/config';
import remarkToc from 'remark-toc';
import remarkMath from 'remark-math';
import rehypeKatex from 'rehype-katex';

export default defineConfig({
  markdown: {
    remarkPlugins: [
      remarkToc,
      remarkMath,
    ],
    rehypePlugins: [
      rehypeKatex,
    ],
    shikiConfig: {
      theme: 'github-dark',
      langs: ['js', 'ts', 'astro', 'md'],
    },
  },
});

Content Processing

// src/utils/content-processing.ts
export function calculateReadingTime(content: string): number {
  const wordsPerMinute = 200;
  const words = content.split(/\s+/).length;
  return Math.ceil(words / wordsPerMinute);
}

export function extractExcerpt(content: string, length = 160): string {
  // Remove markdown syntax
  const plainText = content
    .replace(/#{1,6}\s+/g, '') // Headers
    .replace(/\*\*(.*?)\*\*/g, '$1') // Bold
    .replace(/\*(.*?)\*/g, '$1') // Italic
    .replace(/\[(.*?)\]\(.*?\)/g, '$1') // Links
    .replace(/```[\s\S]*?```/g, '') // Code blocks
    .replace(/`(.*?)`/g, '$1'); // Inline code
  
  return plainText.length > length 
    ? plainText.substring(0, length) + '...'
    : plainText;
}

export function generateSlug(title: string): string {
  return title
    .toLowerCase()
    .replace(/[^a-z0-9\s-]/g, '')
    .replace(/\s+/g, '-')
    .replace(/-+/g, '-')
    .trim();
}

RSS and Sitemaps

RSS Feed Generation

// src/pages/rss.xml.ts
import rss from '@astrojs/rss';
import { getCollection } from 'astro:content';

export async function GET(context) {
  const posts = await getCollection('blog');
  const publishedPosts = posts.filter(post => !post.data.draft);

  return rss({
    title: 'My Blog',
    description: 'A blog about web development',
    site: context.site,
    items: publishedPosts.map(post => ({
      title: post.data.title,
      pubDate: post.data.publishDate,
      description: post.data.description,
      link: `/blog/${post.slug}/`,
      categories: post.data.tags,
      author: post.data.author.email,
    })),
    customData: `<language>en-us</language>`,
  });
}

Dynamic Sitemap

// src/pages/sitemap.xml.ts
import type { APIRoute } from 'astro';
import { getCollection } from 'astro:content';

export const GET: APIRoute = async ({ site }) => {
  const posts = await getCollection('blog');
  const publishedPosts = posts.filter(post => !post.data.draft);

  const staticPages = [
    '',
    '/about',
    '/contact',
    '/blog',
  ];

  const blogPages = publishedPosts.map(post => `/blog/${post.slug}`);
  
  const allPages = [...staticPages, ...blogPages];

  const sitemap = `<?xml version="1.0" encoding="UTF-8"?>
    <urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
      ${allPages.map(page => `
        <url>
          <loc>${site}${page}</loc>
          <changefreq>weekly</changefreq>
          <priority>0.8</priority>
        </url>
      `).join('')}
    </urlset>`;

  return new Response(sitemap, {
    headers: {
      'Content-Type': 'application/xml'
    }
  });
};

Performance Optimization

Pagination

---
// src/pages/blog/[...page].astro
import { getCollection } from 'astro:content';
import Layout from '../../layouts/Layout.astro';

export async function getStaticPaths({ paginate }) {
  const posts = await getCollection('blog');
  const publishedPosts = posts
    .filter(post => !post.data.draft)
    .sort((a, b) => b.data.publishDate.valueOf() - a.data.publishDate.valueOf());

  return paginate(publishedPosts, { pageSize: 10 });
}

const { page } = Astro.props;
---

<Layout title="Blog">
  <h1>Blog Posts</h1>
  
  {page.data.map(post => (
    <article>
      <h2><a href={`/blog/${post.slug}`}>{post.data.title}</a></h2>
      <p>{post.data.description}</p>
    </article>
  ))}
  
  <!-- Pagination -->
  <nav>
    {page.url.prev && <a href={page.url.prev}>Previous</a>}
    <span>Page {page.currentPage} of {page.lastPage}</span>
    {page.url.next && <a href={page.url.next}>Next</a>}
  </nav>
</Layout>

Lazy Loading Content

---
// Load only essential content initially
const recentPosts = await getCollection('blog', ({ data }) => {
  const oneWeekAgo = new Date();
  oneWeekAgo.setDate(oneWeekAgo.getDate() - 7);
  return data.publishDate >= oneWeekAgo && !data.draft;
});
---

<Layout>
  <!-- Recent posts load immediately -->
  <section>
    <h2>Recent Posts</h2>
    {recentPosts.map(post => (
      <PostCard post={post} />
    ))}
  </section>
  
  <!-- Archive loads on demand -->
  <ArchiveSection client:visible />
</Layout>

Testing Content Collections

Schema Validation Tests

// tests/content-schema.test.ts
import { z } from 'zod';
import { blogSchema } from '../src/content/config';

describe('Blog Schema', () => {
  test('validates correct blog post data', () => {
    const validPost = {
      title: 'Test Post',
      description: 'A test post',
      publishDate: new Date(),
      author: 'Test Author',
      tags: ['test'],
      featured: false,
      draft: false,
    };

    expect(() => blogSchema.parse(validPost)).not.toThrow();
  });

  test('rejects invalid blog post data', () => {
    const invalidPost = {
      title: '', // Empty title should fail
      description: 'A test post',
      publishDate: 'invalid-date', // Invalid date
      author: 'Test Author',
      tags: [], // Empty tags array should fail
    };

    expect(() => blogSchema.parse(invalidPost)).toThrow();
  });
});

Content Query Tests

// tests/content-queries.test.ts
import { getCollection } from 'astro:content';

describe('Content Queries', () => {
  test('filters draft posts correctly', async () => {
    const allPosts = await getCollection('blog');
    const publishedPosts = allPosts.filter(post => !post.data.draft);
    
    expect(publishedPosts.every(post => !post.data.draft)).toBe(true);
  });

  test('sorts posts by date correctly', async () => {
    const posts = await getCollection('blog');
    const sortedPosts = posts.sort(
      (a, b) => b.data.publishDate.valueOf() - a.data.publishDate.valueOf()
    );
    
    for (let i = 1; i < sortedPosts.length; i++) {
      expect(sortedPosts[i-1].data.publishDate.valueOf())
        .toBeGreaterThanOrEqual(sortedPosts[i].data.publishDate.valueOf());
    }
  });
});

Best Practices

Content Organization

  1. Consistent Naming: Use kebab-case for file names and slugs
  2. Logical Structure: Group related content in subdirectories
  3. Asset Management: Keep images close to content files
  4. Schema Evolution: Plan for schema changes and migrations

Performance

  1. Selective Loading: Only load content you need
  2. Pagination: Implement pagination for large collections
  3. Caching: Cache expensive content operations
  4. Image Optimization: Use Astro’s Image component

Developer Experience

  1. Type Safety: Leverage TypeScript for better DX
  2. Validation: Use comprehensive schemas
  3. Documentation: Document your content structure
  4. Testing: Test content queries and transformations

Conclusion

Content Collections provide a powerful foundation for building content-driven websites with Astro. By leveraging type safety, validation, and excellent developer tooling, you can create maintainable and scalable content management systems.

Key takeaways:

  • Define comprehensive schemas for type safety
  • Use advanced querying for dynamic content
  • Implement proper content relationships
  • Optimize performance with pagination and lazy loading
  • Test your content structure and queries
  • Follow best practices for organization and maintenance

Content Collections make it easy to build everything from simple blogs to complex documentation sites and e-commerce catalogs with confidence and excellent performance.