The Two-column Resume and ATS systems: The Challenges of Parsing a PDF File


What is an ATS?

An ATS, or Applicant Tracking System, is software used by organizations to manage their recruitment processes.

It handles the whole workflow, from job postings and resume collection to communication with applicants.

One aspect that often concerns job seekers is the resume parsing feature, which pre-screens candidates. Many applicants worry that their resumes might be rejected if the formatting is not properly recognized by the ATS.

Can my resume layout interfere with an ATS?

Well, it is actually difficult to know. ATSs are closed systems and their codebases are not public, which makes it hard to determine whether a resume's format or layout affects its ranking. That said, it would be very damaging for a company to reject a perfectly suited candidate just because the system failed to parse the resume correctly.

So should I do something in particular? Maybe keep my resume in a single column?

The most basic ATS parsing tool looks for specific keywords in a resume. To be honest, unless your resume is a plain text file, many factors can affect the ATS software's ability to parse it properly, and this is true even if your resume has a single column. For example, extra letter spacing in section titles can prevent the ATS from recognizing these titles as words.
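To make the letter-spacing problem concrete, here is a minimal sketch of a keyword matcher. It is an illustration only: real ATS products are closed, so `find_keywords`, the keyword list, and the sample strings are assumptions, not the behavior of any actual system.

```python
import re

def find_keywords(text: str, keywords: list[str]) -> list[str]:
    """Return the keywords found as whole words in the text (case-insensitive)."""
    words = set(re.findall(r"[a-z0-9+#]+", text.lower()))
    return sorted(kw for kw in keywords if kw.lower() in words)

# Hypothetical keyword list a basic screener might look for.
keywords = ["experience", "education", "kafka"]

normal = "Experience\nLed a team working on Apache Kafka"
# Same content, but the section title was extracted with extra letter spacing.
spaced = "E x p e r i e n c e\nLed a team working on Apache Kafka"

print(find_keywords(normal, keywords))  # ['experience', 'kafka']
print(find_keywords(spaced, keywords))  # ['kafka']
```

In the second call the title is seen as a series of single letters, so the word "experience" is never matched, even though the layout is a single column.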

But let's do a quick test

Our test

To get a better idea of whether a two-column PDF resume is difficult to parse, we ran a quick test with various PDF parsers.

We will submit a mock PDF resume to each of these parsers and check whether the resulting plain text:

  • contains every word from the original resume, with no extra content.
  • keeps each word in its original section (for example, a certification does not end up in the languages section).
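The two checks above can be sketched in a few lines of Python. This is a hedged illustration of the idea, not the script used for the test: the function names are hypothetical, the first sample uses a fragment of parser output shown later in this article, and the section dictionaries are made-up data.

```python
import re
from collections import Counter

def word_counts(text: str) -> Counter:
    """Count lowercase word occurrences in a text."""
    return Counter(re.findall(r"\w+", text.lower()))

def compare_words(original: str, extracted: str) -> tuple[Counter, Counter]:
    """Return (missing, extra): words lost by the parser and words it added."""
    orig, extr = word_counts(original), word_counts(extracted)
    return orig - extr, extr - orig

def misplaced_words(expected: dict[str, str], extracted: dict[str, str]) -> dict[str, list[str]]:
    """For each section, list expected words that are absent from that section in the output."""
    report = {}
    for name, text in expected.items():
        lost = word_counts(text) - word_counts(extracted.get(name, ""))
        if lost:
            report[name] = sorted(lost)
    return report

# Check 1: every word present, nothing extra.
missing, extra = compare_words("Self employed Scala Software",
                               "Self emplo yed Scala Softwar e")
print(sorted(missing))  # ['employed', 'software']
print(sorted(extra))    # ['e', 'emplo', 'softwar', 'yed']

# Check 2: words stay in their section (hypothetical section split).
expected = {"Languages": "English Native", "Certifications": "Kafka Certified"}
extracted = {"Languages": "English Native Kafka Certified", "Certifications": ""}
print(misplaced_words(expected, extracted))  # {'Certifications': ['certified', 'kafka']}
```

Splitting the extracted text into sections is the hard part in practice and is assumed to be already done here.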

The PDF resume we are using for the test


Tool 1: pdftotext (Poppler)

The pdftotext utility in Ubuntu / Linux relies on the Poppler library, which is an open-source PDF rendering library. Poppler itself is based on Xpdf, an older PDF viewer and library.

The library is old, but it remains very popular and is very fast.
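For readers who want to try this step themselves, here is a minimal sketch that calls pdftotext from Python. The file name `resume.pdf` and the helper names are assumptions, and the article does not state which options were used for the test, so the optional `-layout` flag is off by default.

```python
import subprocess

def pdftotext_command(pdf_path: str, keep_layout: bool = False) -> list[str]:
    """Build the pdftotext command line; "-" sends the text to standard output."""
    cmd = ["pdftotext"]
    if keep_layout:
        cmd.append("-layout")  # ask pdftotext to preserve the physical layout
    cmd += [pdf_path, "-"]
    return cmd

def extract_text(pdf_path: str, keep_layout: bool = False) -> str:
    """Run pdftotext (it must be installed) and return the extracted plain text."""
    result = subprocess.run(
        pdftotext_command(pdf_path, keep_layout),
        capture_output=True, text=True, check=True,
    )
    return result.stdout

# Hypothetical file name; only the command is printed here, nothing is executed.
print(pdftotext_command("resume.pdf"))  # ['pdftotext', 'resume.pdf', '-']
```

Calling `extract_text("resume.pdf")` would return the plain text that the checks are run against. The output we obtained for our mock resume follows.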

Senior Engineer
Google - San Francisco

June 2019 - Now

• Led a team to overhaul the backend messaging system based on


Scala Software
Address :
Crocker, San Francisco, USA
Phone :
+1 (555) 555 5555
Email :

Japanese Started learning
in 2020



Apache Kafka. This involved developing new components and
architectures, optimizing existing systems, and ensuring that the
system is secure and performant.
• Responsible for designing and developing data pipelines and APIs,
maintaining uptime and scalability, and providing technical guidance
and support to other teams.

Engineer - Scala & Kafka

January 2015 - May 2019

• Developed and deployed microservices using Scala , Go in a team
of 4 coders
• Analyzed the existing systems and identified areas of improvement
• Developed and deployed high-availability and fault-tolerant services
• Collaborated with other teams to ensure smooth delivery of
services, and adapt microservices REST routes

Backend Engineer

Jan 2011 - Nov 2014

• Leveraged Google Cloud Platform to manage a Kafka cluster
deployed in Kubernetes and managed various aspects of the cluster,
such as scaling, security, and performance.
• Developed custom connectors using Kafka Connect API to integrate
with other Google services such as BigQuery and Dataproc.

Freelance Ruby On Rails coder
Self employed

Dec 2006 - Aug 2010

For various clients :
• Developed and maintained Ruby on Rails web applications to
enable users to store and access data on demand
• Designed and implemented a RESTful API utilizing Ruby on Rails
and Postgresql to integrate with existing systems
• Leveraged TDD to ensure the quality and correctness of the code
• Created custom rake tasks to automate routine tasks
• Collaborated with other developers to ensure the application was
up to industry standards

Apache Kafka



BS in Computer Science

Ruby on Rails

UC Berkeley - Berkeley

Mongo DB

2001 - 2005


As you can see:

  • Every word is present.
  • The content of the two columns is mixed up. The output starts with the Experience section, so the main column is parsed first. But after the first line of the applicant's most recent work experience, it switches to the side-column data.
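One common reason for this kind of interleaving is that a PDF stores positioned text fragments rather than a reading order. The sketch below illustrates the idea with made-up coordinates; it is not how Poppler works internally, and the `Fragment` type, the positions, and the `split_x` boundary are all assumptions.

```python
from typing import NamedTuple

class Fragment(NamedTuple):
    x: float   # horizontal position of the fragment
    y: float   # distance from the top of the page
    text: str

# Hypothetical positions: side column near x=20, main column near x=200.
fragments = [
    Fragment(200, 10, "Senior Engineer"),
    Fragment(20, 12, "Address :"),
    Fragment(200, 30, "June 2019 - Now"),
    Fragment(20, 32, "Phone :"),
]

def naive_order(frags: list[Fragment]) -> list[str]:
    """Read strictly top to bottom, ignoring columns."""
    return [f.text for f in sorted(frags, key=lambda f: (f.y, f.x))]

def column_order(frags: list[Fragment], split_x: float) -> list[str]:
    """Read the side column first, then the main column, each top to bottom."""
    side = sorted((f for f in frags if f.x < split_x), key=lambda f: f.y)
    main = sorted((f for f in frags if f.x >= split_x), key=lambda f: f.y)
    return [f.text for f in side + main]

print(naive_order(fragments))
# ['Senior Engineer', 'Address :', 'June 2019 - Now', 'Phone :']
print(column_order(fragments, split_x=100))
# ['Address :', 'Phone :', 'Senior Engineer', 'June 2019 - Now']
```

A parser that does not detect the column boundary produces something like the first result, where lines from both columns alternate.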

Tool 2: Apache Tika 2.6.0

Apache Tika is a content analysis toolkit that detects and extracts metadata and text from various document types (not only PDF).


Crocker, San Francisco, USA

+1 (555) 555 5555

English Native  
Japanese Started learning

in 2020


Google - San Francisco



Self employed

Scala Software


Address :

Phone :

Email :




Apache Kafka
Ruby on Rails
Mongo DB


Senior Engineer
June 2019 - Now

• Led a team to overhaul the backend messaging system based on
Apache Kafka. This involved developing new components and
architectures, optimizing existing systems, and ensuring that the
system is secure and performant. 
• Responsible for designing and developing data pipelines and APIs,
maintaining uptime and scalability, and providing technical guidance
and support to other teams.

Engineer - Scala & Kafka
January 2015 - May 2019

• Developed and deployed microservices using Scala , Go in a team
of 4 coders
• Analyzed the existing systems and identified areas of improvement
• Developed and deployed high-availability and fault-tolerant services
• Collaborated with other teams to ensure smooth delivery of
services, and adapt microservices REST routes

Backend Engineer
Jan 2011 - Nov 2014

• Leveraged Google Cloud Platform to manage a Kafka cluster
deployed in Kubernetes and managed various aspects of the cluster,
such as scaling, security, and performance.
• Developed custom connectors using Kafka Connect API to integrate
with other Google services such as BigQuery and Dataproc. 

Freelance Ruby On Rails coder
Dec 2006 - Aug 2010

For various clients :
• Developed and maintained Ruby on Rails web applications to
enable users to store and access data on demand
• Designed and implemented a RESTful API utilizing Ruby on Rails
and Postgresql to integrate with existing systems
• Leveraged TDD to ensure the quality and correctness of the code
• Created custom rake tasks to automate routine tasks
• Collaborated with other developers to ensure the application was
up to industry standards


BS in Computer Science
UC Berkeley - Berkeley 2001 - 2005

Here with Tika:

  • Every word is present in the final text.
  • The content of the two columns is mixed up, though slightly less than with the Poppler library. The parser starts with the data from the side column. It also extracts data that is missing from the Poppler output, such as the email address or link URLs. The content of the Experience and Education sections is not mixed up with any other section.

Apache Tika gives better results than Poppler, especially for the main sections, Experience and Education, whose content is reproduced faithfully and not mixed up with the other column. If an ATS uses a semantic analyzer, it is more likely to produce good results from this output.

Tool 3: PyPDF2

PyPDF2 is also a popular tool, but it gives mixed results on our resume. The content of the Experience section is preserved, yet it seems to have difficulty parsing sections with shorter content, such as lists. It also does not respect the reading flow as well as the previous tools. Excerpt:

Crocker, San Francisco, USA
+1 (555) 555 5555
English Native  
JapaneseStarted learning
in 2020 
GithubGoogle -San F rancisco
Self emplo yed
Scala Softwar e
Address :
Phone :
Email :

Tool 4: Parsr

Parsr is a tool based on PDFMiner that uses OCR tools such as Tesseract on top of it. It is developed by the French company AXA and can be found here:

It is a modular and highly customizable tool, which makes it slightly more difficult to use and probably less versatile than the other tools tested above. It was difficult to get consistent results for our resume, and the best results we could obtain did not exceed those of Tika or pdftotext.


Parsing PDFs is notoriously tricky. Most parsers struggle to interpret documents like a human would. However, three out of four parsers can still display every word as it appears on the original resume. Impressively, all words from the Experience and Education sections are faithfully reproduced, and one parser, Tika, even manages to capture both sections in full.

With advanced tools like the Adobe API and recent breakthroughs in artificial intelligence, I'm confident that soon any resume will be parsed accurately and without errors.

At CVdunk, we're all in on two-column resumes! This format fills a gap, offering unique advantages and potentially being the perfect fit for many candidates. Create your resume today with CVdunk and get the best of our single- and two-column resume templates!