First steps to get ATS-Friendly CVs: Automating Professional Resumes with YAML, HTML and Python

There are currently some issues with the code highlighter

Author:

Francisco

Language:

Date:

2024-12-01

Description:

Some steps to get an ATS friendly flexible CV

This article contains some technical recommendations, but is not fully intended as a tutorial.

Some of the dependencies here might not work on every OS.

Be aware to store and process your (or any personal data you have access to) in a safe and secured environment.

Motivation

Last week I spent some time improving my CV. On the motto better late than never, I decided to jump on the ATS-friendly CV bandwagon and started to experiment with Markdown.

Going back in time, my first CVs were of course in MS Word. By the time of writing my bachelor's thesis (2010), in technical academia I was exposed to LaTeX. After spending a lot of time in tweaking the template and the formatting I got a beautiful and visually appealing CV I was very proud of. But here trouble began. Applicant Tracking Systems (ATS) often struggle with processing complex layouts effectively. Every big-company system that allowed for import just didn't parse my LaTeX CVs correctly, Characters, sections and layout just got messed up. Every time I had to retype the complete information into the system. So I was back to Word for the next decade and half.

Overall Word is still the most recommended and ATS-friendly format, but with my technical savviness I wanted a configurable CV.

Solution strategy

Markdown (MD), AsciiDoc (ADOC) and Restructured Text (RST) are the backbone of technical documentation. These formats are simple (or lightweight) markup languages built on top of ASCII-text files. They are the skinnier cousins of plain HTML and may have some application-specific extensions and features.

Together with a templating engine like Jinja2 or Nunjucks we can use a structured data source (YAML, JSON, XML, etc.) and generate a custom markdown document which together with an markdown to HTML generator (and a style sheet) can generate a document that can be printed to PDF.

However, Markdown often falls short when more flexible and complex layout styles are required. Soon your Markdown will be full of HTML tags anyway, so after trying out several versions with a YAML-Markdown-HTML-PDF pipeline I decided to ditch the Markdown step.

As a result, the proposed solution strategy is as follows:

Create a Data Model
Create a HTML Jinja2 template
Generate HTML
Print PDF

The Data Model

The first step is to describe your data model using YAML or JSON. In this example, I have selected YAML to implement it. After setting the necessary elements that describe the date used in a CV (personal data, skills, experience, education), I can create different variants depending on the target positions.

The following YAML example (cv_data.yaml) demonstrates a basic structure:

---
name: John Doe
title: Software Developer
specialities:
  - Full-stack Web Development
contact:
  email: john.doe@example.com
  phone: 123456789
  address: New Web City
skills:
  - Python
  - JavaScript
  - HTML/CSS
  - C
summary: Experienced full-stack developer for fast-loading multi-content web
  sites written in C
experience:
  - company: Tech Solutions Inc.
    role: Senior Developer
    period:
      start: Jan 2020
      end: Present
    description: Developed tech stack to serve 100M requests per day with backend
      developed in C.
  - company: Technocorp S.A.
    role: Medior Developer
    period:
      start: Jul 2018
      end: Dec 2019
    description: Backend development for load-balancing request architecture
  - company: Webensa Corp
    role: Junior Developer
    period:
      start: Jan 2017
      end: Jul 2018
    description: Frontend JS development
education:
  - institution: High Technical University, Capitol City
    degree: BSc. Computer Science
    period end: Dec 2016
languages:
  - language: German
    level: Basic proficiency
  - language: Spanish
    level: Full professional
  - language: English
    level: Native
certifications: 
  - name: "Certification as JS programming person - Foundation Level"
    short_name: CAJSPP-FL
    url: "http://www.certificate.com/uuid"
skills:
  - "JavaScript":
    - "Angular.js"
    - "Node.js"
  - "C":
    - "picohttp"
  - Programming Languages: 
    - C
    - JavaScript
    - Cobol
    - FORTRAN
    - Python

Practical Tip: Always validate or lint your YAML file to ensure correctness.

Advanced tip: You can create more advanced schema-based validation for your data model. This is likely overkill for most applications.

The Template

I chose Jinja2 since it is the standard in the Python world.

Templating allows us to dynamically fill-in the data we defined in the previous step and maintains a very flexible approach. We can generate several CVs by customizing either the input data model or by customizing the template.

Here as an example (cv_template.html.j2):

<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>{{ name }} - {{ title }}</title>
    <style>
        body {
            font-family: Arial, sans-serif;
            line-height: 1.6;
            max-width: 800px;
            margin: 0 auto;
            padding: 20px;
        }
        section {
            margin-bottom: 2em;
        }
        h1, h2 {
            color: #333;
            margin-bottom: 0.5em;
        }
        .contact-info {
            margin-bottom: 1.5em;
        }
        .experience-item, .education-item {
            margin-bottom: 1.5em;
        }
        .role, .company {
            font-weight: bold;
        }
        .period {
            color: #666;
        }
        .skills-list {
            display: flex;
            flex-wrap: wrap;
            gap: 0.5em;
        }
        .skill-item {
            background: #f5f5f5;
            padding: 0.3em 0.6em;
            border-radius: 3px;
        }
    </style>
</head>
<body>
    <header>
        <h1>{{ name }}</h1>
        <h2>{{ title }}</h2>
        
        <section class="contact-info">
            <p>
                {% if contact.email %}Email: {{ contact.email }}{% endif %}
                {% if contact.phone %} | Phone: {{ contact.phone }}{% endif %}
                {% if contact.address %} | Location: {{ contact.address }}{% endif %}
            </p>
        </section>
    </header>

    {% if summary %}
    <section>
        <h2>Professional Summary</h2>
        <p>{{ summary }}</p>
    </section>
    {% endif %}

    {% if specialities %}
    <section>
        <h2>Specialities</h2>
        <ul>
            {% for speciality in specialities %}
            <li>{{ speciality }}</li>
            {% endfor %}
        </ul>
    </section>
    {% endif %}

    {% if experience %}
    <section>
        <h2>Professional Experience</h2>
        {% for job in experience %}
        <div class="experience-item">
            <p>
                <span class="role">{{ job.role }}</span> at 
                <span class="company">{{ job.company }}</span>
            </p>
            <p class="period">{{ job.period.start }} - {{ job.period.end }}</p>
            <p>{{ job.description }}</p>
        </div>
        {% endfor %}
    </section>
    {% endif %}

    {% if education %}
    <section>
        <h2>Education</h2>
        {% for edu in education %}
        <div class="education-item">
            <p>
                <span class="degree">{{ edu.degree }}</span>
                <br>{{ edu.institution }}
                {% if edu["period end"] %}<br>Completed: {{ edu["period end"] }}{% endif %}
            </p>
        </div>
        {% endfor %}
    </section>
    {% endif %}

    {% if languages %}
    <section>
        <h2>Languages</h2>
        <ul>
            {% for lang in languages %}
            <li>{{ lang.language }}: {{ lang.level }}</li>
            {% endfor %}
        </ul>
    </section>
    {% endif %}

    {% if certifications %}
    <section>
        <h2>Certifications</h2>
        <ul>
            {% for cert in certifications %}
            <li>
                {{ cert.name }}
                {% if cert.url %}
                ({{ cert.url }})
                {% endif %}
            </li>
            {% endfor %}
        </ul>
    </section>
    {% endif %}

    {% if skills %}
    <section>
        <h2>Technical Skills</h2>
        {% for skill in skills %}
            {% if skill is mapping %}
                {% for category, items in skill.items() %}
                <h3>{{ category }}</h3>
                <ul>
                    {% for item in items %}
                    <li>{{ item }}</li>
                    {% endfor %}
                </ul>
                {% endfor %}
            {% else %}
                <div class="skill-item">{{ skill }}</div>
            {% endif %}
        {% endfor %}
    </section>
    {% endif %}
</body>
</html>

Rendering the HTML

The next step is to render the HTML.

Following can be tried out in a Jupyter notebook or directly as a script.

from jinja2 import Environment, FileSystemLoader
import yaml

# Load YAML data
with open('cv_data.yaml', 'r') as file:
    cv_data = yaml.safe_load(file)

# Set up Jinja2 environment
env = Environment(loader=FileSystemLoader('.'))
template = env.get_template('cv_template.html.j2')

# Render the template
output = template.render(**cv_data)

# Save the output
with open('cv.html', 'w') as file:
    file.write(output)

However this is just the first HTML rendering.

Print to PDF

The final step is to refine the script to also print to PDF. For this we need to also include the proper page sizes (A4 for European outputs).

Just take into account that weasyprint might not work in your OS.

import yaml
from jinja2 import Environment, FileSystemLoader
from weasyprint import HTML, CSS
from weasyprint.text.fonts import FontConfiguration
import os

def load_yaml(file_path):
    """Load YAML data from file."""
    with open(file_path, 'r', encoding='utf-8') as file:
        return yaml.safe_load(file)

def create_output_dir(dir_name='output'):
    """Create output directory if it doesn't exist."""
    if not os.path.exists(dir_name):
        os.makedirs(dir_name)
    return dir_name

def generate_cv(yaml_path, template_path, output_dir='output'):
    """Generate CV in HTML and PDF formats."""
    # Load data
    cv_data = load_yaml(yaml_path)
    
    # Set up Jinja2 environment
    env = Environment(loader=FileSystemLoader(os.path.dirname(template_path)))
    template = env.get_template(os.path.basename(template_path))
    
    # Create output directory
    output_dir = create_output_dir(output_dir)
    
    # Generate HTML
    html_output = template.render(**cv_data)
    html_path = os.path.join(output_dir, 'cv.html')
    with open(html_path, 'w', encoding='utf-8') as file:
        file.write(html_output)
    
    # Configure WeasyPrint
    font_config = FontConfiguration()
    
    # Additional CSS for PDF
    pdf_css = CSS(string='''
        @page {
            margin: 1cm;
            size: A4;
            @top-right {
                content: "Page " counter(page) " of " counter(pages);
            }
        }
        body {
            font-size: 11pt;
        }
        h1 { font-size: 18pt; }
        h2 { font-size: 14pt; }
        h3 { font-size: 12pt; }
    ''', font_config=font_config)
    
    # Generate PDF
    HTML(html_path).write_pdf(
        os.path.join(output_dir, 'cv.pdf'),
        stylesheets=[pdf_css],
        font_config=font_config
    )
    
    print(f"Generated CV in {output_dir}:")
    print(f"- HTML: {os.path.join(output_dir, 'cv.html')}")
    print(f"- PDF: {os.path.join(output_dir, 'cv.pdf')}")

def main():
    # File paths
    yaml_path = 'cv_data.yaml'
    template_path = 'cv_template.html.j2'
    
    try:
        generate_cv(yaml_path, template_path)
    except Exception as e:
        print(f"Error generating CV: {str(e)}")
        raise

if __name__ == '__main__':
    main()

Example CV

Further tweaking

Now of course there is some further tweaking to do. For automation and repeatability, you can use argparse to make a CLI application which takes YAML files and HTML templates.

Metadata

The use of metadata in ATS seems to be quite contradictory. In general, use just title and description. Keywords are likely to be ignored if they are not part of the main text.

Verification

You can use pdfplumber and other PDF to text extractors to test the output of the PDF printer process.

import pdfplumber

text_content = []

with pdfplumber.open("output/cv.pdf") as pdf:
    
    for page in pdf.pages:
        text = page.extract_text()
        if text:
            text_content.append(text)

text_content = '\n'.join(text_content)

print(text_content)

Here we get the text output:

John Doe
Software Developer
Email: john.doe@example.com | Phone: 123456789 | Location: New Web City
Professional Summary
Experienced full-stack developer for fast-loading multi-content web sites written in C
Specialities
• Full-stack Web Development
Professional Experience
Senior Developer at Tech Solutions Inc.
Jan 2020 - Present
Developed tech stack to serve 100M requests per day with backend developed in C.
Medior Developer at Technocorp S.A.
Jul 2018 - Dec 2019
Backend development for load-balancing request architecture
Junior Developer at Webensa Corp
Jan 2017 - Jul 2018
Frontend JS development
Education
BSc. Computer Science
High Technical University, Capitol City
Completed: Dec 2016
Languages
• German: Basic proficiency
• Spanish: Full professional
• English: Native
...
• JavaScript
• Cobol
• FORTRAN
• Python

It is important to verify that the output is parsed and extracted correctly

Conclusion

While Markdown seemed promising initially, the direct YAML with templated HTML/CSS approach has proven more maintainable and reliable. This approach provides enhanced control over presentation, enabling the creation of ATS-friendly CVs that are both professional and visually appealing.

The combination of YAML for data, Jinja2 for templates, and WeasyPrint for PDF generation provides a robust pipeline that can grow with your needs. Whether you're maintaining multiple CV versions or need precise control over the layout, this approach gives you the flexibility without the limitations of Markdown.

It is important to note that a CV often serves as the first impression for potential employers. Having a reliable, maintainable generation process lets you focus on the content rather than fighting with formatting issues.

Also this approach allows for complete flexibility in tailoring your CV for the target position.

AI Disclaimers

The author used several AI tools and models to improve and review the content of this article like Claude.ai and ChatGPT.

Return to Blog