Build a Resume Parser in 10 Lines of Code
If you're building an ATS, HR tool, or recruiting platform, you need to parse resumes. Candidates upload PDFs in hundreds of different layouts, and your system needs structured data out of every one of them.
Here's how to build a production-ready resume parser in 10 lines of Python.
The Code
```python
import base64, requests

def parse_resume(file_path):
    with open(file_path, "rb") as f:
        content = base64.b64encode(f.read()).decode()
    resp = requests.post(
        "https://api.ramlabs.dev/v1/extract",
        headers={"Authorization": "Bearer YOUR_API_KEY"},
        json={"document": content, "documentType": "pdf", "schema": "resume"},
    )
    return resp.json()["data"]

# Parse a resume
candidate = parse_resume("sarah_chen_resume.pdf")
print(f"Name: {candidate['name']['value']}")
print(f"Email: {candidate['email']['value']}")
print(f"Skills: {', '.join(candidate['skills']['value'])}")
```
That's it. No NLP libraries, no training data, no regex.
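The ten-liner above assumes every request succeeds and every response carries a `data` key. Before trusting the payload in production, it's worth validating it. A minimal sketch, assuming the response shape shown in this article (an `error` field on failure is an assumption; adjust to what the API actually returns):

```python
# Hypothetical helper: validate the parsed response body before using it.
# The "data" key matches the response shape shown in this article; the
# "error" key is an assumed failure field.

def unwrap(payload: dict) -> dict:
    """Return payload["data"], or raise with a useful message."""
    if "data" not in payload:
        raise ValueError(f"extraction failed: {payload.get('error', 'unknown error')}")
    return payload["data"]
```

Then `return unwrap(resp.json())` instead of indexing blindly, so a failed extraction surfaces as a clear exception rather than a `KeyError`.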
What Gets Extracted
```json
{
  "name": { "value": "Sarah Chen", "confidence": 0.99 },
  "email": { "value": "sarah.chen@email.com", "confidence": 0.99 },
  "phone": { "value": "(415) 555-0142", "confidence": 0.97 },
  "location": { "value": "San Francisco, CA", "confidence": 0.95 },
  "experience": {
    "value": [
      {
        "title": "Senior Software Engineer",
        "company": "Stripe",
        "period": "2021 - Present",
        "highlights": [
          "Architected payment processing pipeline handling 10M+ transactions/day",
          "Led migration from monolith to microservices"
        ]
      }
    ],
    "confidence": 0.94
  },
  "education": {
    "value": [
      { "degree": "B.S. Computer Science", "institution": "Stanford University", "year": "2016" }
    ],
    "confidence": 0.96
  },
  "skills": {
    "value": ["TypeScript", "Python", "Go", "React", "Node.js", "AWS", "Kubernetes"],
    "confidence": 0.93
  }
}
```
Traditional vs ScoutExtract
Traditional approach
- Install NLP libraries (spaCy, NLTK)
- Build text extraction pipeline
- Train NER model
- Write regex for phones, emails, dates
- Handle edge cases forever
Time: 2-4 weeks
Accuracy: 70-85%
ScoutExtract approach
- Sign up for API key
- Send resume + schema
- Get structured JSON
Time: 15 minutes
Accuracy: 90-98%
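"Handle edge cases forever" is the real cost of the traditional route. As one illustration (not from the article, just a typical hand-rolled pattern): a US-style phone regex matches the sample resume's number but silently misses common international formats.

```python
import re

# A typical hand-rolled US phone pattern -- the kind of regex a
# traditional pipeline accumulates. It matches "(415) 555-0142"
# but silently misses a UK-formatted number.
US_PHONE = re.compile(r"\(?\d{3}\)?[ -]?\d{3}-\d{4}")

print(bool(US_PHONE.search("(415) 555-0142")))    # US format: matched
print(bool(US_PHONE.search("+44 20 7946 0958")))  # UK format: missed
```

Each miss like this becomes another regex branch to write, test, and maintain.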
Batch Processing
```python
import os, json

results = []
for filename in os.listdir("./resumes/"):
    if filename.endswith((".pdf", ".png", ".jpg")):
        data = parse_resume(f"./resumes/{filename}")
        results.append({
            "file": filename,
            "name": data["name"]["value"],
            "email": data["email"]["value"],
            "skills": data["skills"]["value"],
        })
        print(f"Parsed: {data['name']['value']} - {len(data['skills']['value'])} skills")

with open("parsed_candidates.json", "w") as f:
    json.dump(results, f, indent=2)
```
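One post-processing step worth adding (an assumption on my part, not something the batch loop above does): candidates often upload the same resume more than once, so collapse the results on normalized email before loading them anywhere.

```python
# Hypothetical dedup pass over the batch results: key on a
# case-folded, whitespace-stripped email and keep the first
# occurrence of each address.

def dedupe_by_email(results: list[dict]) -> list[dict]:
    seen = set()
    unique = []
    for row in results:
        key = row["email"].strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique
```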
Smart Candidate Screening
```python
def screen_candidate(data):
    # Skill match: 20 points per required skill found on the resume
    skills = [s.lower() for s in data["skills"]["value"]]
    required = ["python", "react", "aws"]
    matched = [s for s in required if s in skills]
    score = len(matched) * 20

    # Experience bonus: three or more roles listed
    if len(data["experience"]["value"]) >= 3:
        score += 20

    # Average extraction confidence across the key fields
    avg_confidence = sum(
        data[f]["confidence"] for f in ["name", "skills", "experience"]
    ) / 3

    return {
        "score": score,
        "matched_skills": matched,
        "confidence": avg_confidence,
        "auto_qualify": score >= 60 and avg_confidence > 0.85,
    }
```
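The flat 20-points-per-skill scheme treats every required skill equally. A natural next step, sketched here as a hypothetical variant (the weights are assumptions to tune per role, not part of the API): weight must-have skills more heavily than nice-to-haves.

```python
# Hypothetical weighted variant of the skill-matching step above:
# each required skill carries its own point value instead of a flat
# 20. Matching is case-insensitive, like in screen_candidate.

def weighted_skill_score(candidate_skills: list[str], weights: dict[str, int]) -> int:
    """weights maps required skill -> points awarded when present."""
    have = {s.lower() for s in candidate_skills}
    return sum(
        points for skill, points in weights.items() if skill.lower() in have
    )

# Example weighting for a backend role (assumed values):
# weighted_skill_score(["Python", "React", "Go"], {"python": 30, "aws": 20, "react": 10})
# scores Python (30) + React (10) and skips the missing AWS.
```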