What is an ATS?
An ATS, or Applicant Tracking System, is software used by organizations to manage their recruitment processes.
It handles the whole workflow, from job postings and resume collection to communications with applicants.
One aspect that often concerns job seekers is the resume parsing feature, which pre-screens candidates. Many applicants worry that their resumes might be rejected if the formatting is not properly recognized by the ATS.
Can my resume layout interfere with an ATS?
Well, it is actually difficult to know. ATSs are closed systems and their codebases are not public, which makes it hard to determine whether a resume's format or layout affects its ranking. That said, it would be very damaging for a company to reject a perfectly suited candidate just because the system failed to parse the resume correctly.
So should I do something in particular? Maybe keep my resume in a single column?
The most basic ATS parsing tool looks for specific keywords in a resume. To be honest, unless your resume is a plain text file, many factors can alter the ATS software's ability to parse it properly, and this is true even if your resume uses a single column. For example, extra letter spacing in section titles can prevent the ATS from recognizing these titles as words.
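To make the letter-spacing problem concrete, here is a minimal Python sketch of a keyword search. It is not the code of any real ATS: the function name `find_keywords`, the keyword list, and the sample texts are ours, and we assume that a letter-spaced title comes out of the PDF as single letters separated by spaces.

```python
import re

def find_keywords(text: str, keywords: list[str]) -> set[str]:
    """Return the keywords that appear as whole words in the text (case-insensitive)."""
    words = set(re.findall(r"[a-z0-9+#]+", text.lower()))
    return {kw for kw in keywords if kw.lower() in words}

# Hypothetical extracted text: the same section, with and without letter spacing.
normal_title = "Experience\nSenior Engineer, Scala and Kafka"
spaced_title = "E x p e r i e n c e\nSenior Engineer, Scala and Kafka"
keywords = ["experience", "scala", "kafka"]

print(sorted(find_keywords(normal_title, keywords)))  # all three keywords are found
print(sorted(find_keywords(spaced_title, keywords)))  # "experience" is no longer found
```

In the second case the title is seen as ten one-letter words, so a search for "experience" comes back empty even though a human reads the title without effort.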
But let's do a quick test.
Our test
To get a better idea of whether a two-column PDF resume is difficult to parse, we ran a quick test with various PDF parsers.
We will submit a mock PDF resume to each of these parsers and check that the resulting plain text:
- contains every word from the original resume, and nothing that is not in the original resume.
- keeps every word in its original section (for example, a certification should not end up in the Languages section).
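The two checks above can be sketched in a few lines of Python. This is only an illustration of the idea, not the procedure we followed for the test: the helper names (`tokenize`, `word_diff`, `misplaced_words`) and the tiny sample texts are hypothetical, and we assume the extracted text has already been split into sections.

```python
import re
from collections import Counter

def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def word_diff(original: str, extracted: str) -> tuple[Counter, Counter]:
    """Return (missing, extra): words lost by the parser and words it added."""
    orig, extr = Counter(tokenize(original)), Counter(tokenize(extracted))
    return orig - extr, extr - orig

def misplaced_words(expected: dict[str, str], extracted: dict[str, str]) -> dict[str, set[str]]:
    """For each section, list extracted words that do not belong to that section."""
    strays = {}
    for name, expected_text in expected.items():
        found = set(tokenize(extracted.get(name, "")))
        unexpected = found - set(tokenize(expected_text))
        if unexpected:
            strays[name] = unexpected
    return strays

# Hypothetical sample: every word survives, but "Native" lands in the wrong section.
original = "Languages English Native Skills Scala Kafka"
parsed = "Languages English Skills Native Scala Kafka"
missing, extra = word_diff(original, parsed)
print(missing, extra)

expected_sections = {"languages": "English Native", "skills": "Scala Kafka"}
parsed_sections = {"languages": "English", "skills": "Native Scala Kafka"}
print(misplaced_words(expected_sections, parsed_sections))
```

Counting words with a `Counter` rather than a set means a word that appears twice in the resume but only once in the output is still reported as missing.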
The PDF resume we are using for the test
Tool 1: pdftotext
The pdftotext utility in Ubuntu / Linux relies on the Poppler library, which is an open-source PDF rendering library. Poppler itself is based on Xpdf, an older PDF viewer and library.
It is an old library, yet it is very popular and very fast.
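For readers who want to try it themselves, here is one possible way to call pdftotext from Python. Treat it as a sketch: the file name `resume.pdf` is a placeholder, the helper names are ours, and the exact options used to produce the output below are not recorded here. The `-layout` flag and the `-` output argument (write to standard output) are standard pdftotext options.

```python
import shutil
import subprocess

def build_pdftotext_command(pdf_path: str, keep_layout: bool = False) -> list[str]:
    """Build the pdftotext command line; the final '-' sends the text to stdout."""
    command = ["pdftotext"]
    if keep_layout:
        command.append("-layout")  # ask pdftotext to preserve the physical layout
    command += [pdf_path, "-"]
    return command

def extract_text(pdf_path: str, keep_layout: bool = False) -> str:
    """Run pdftotext on a PDF file and return the extracted plain text."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; on Ubuntu it is part of poppler-utils")
    result = subprocess.run(
        build_pdftotext_command(pdf_path, keep_layout),
        capture_output=True,
        text=True,
        check=True,  # raise if pdftotext exits with an error
    )
    return result.stdout

# Example call (requires pdftotext and a local file named resume.pdf):
# print(extract_text("resume.pdf"))
```

Comparing the output with and without `keep_layout=True` is a quick way to see how much the reading order of a two-column document depends on the extraction mode.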
Experience Senior Engineer Google - San Francisco June 2019 - Now • Led a team to overhaul the backend messaging system based on ARTHUR WALKER Scala Software Engineer Contact Address : Crocker, San Francisco, USA Phone : +1 (555) 555 5555 Email : arthur.walker@dummym.com Languages English Native Japanese Started learning in 2020 Links LinkedIn Github Skills Apache Kafka. This involved developing new components and architectures, optimizing existing systems, and ensuring that the system is secure and performant. • Responsible for designing and developing data pipelines and APIs, maintaining uptime and scalability, and providing technical guidance and support to other teams. Engineer - Scala & Kafka Twitter January 2015 - May 2019 • Developed and deployed microservices using Scala , Go in a team of 4 coders • Analyzed the existing systems and identified areas of improvement • Developed and deployed high-availability and fault-tolerant services • Collaborated with other teams to ensure smooth delivery of services, and adapt microservices REST routes Backend Engineer Freshworks Jan 2011 - Nov 2014 • Leveraged Google Cloud Platform to manage a Kafka cluster deployed in Kubernetes and managed various aspects of the cluster, such as scaling, security, and performance. • Developed custom connectors using Kafka Connect API to integrate with other Google services such as BigQuery and Dataproc. 
Freelance Ruby On Rails coder Self employed Dec 2006 - Aug 2010 For various clients : • Developed and maintained Ruby on Rails web applications to enable users to store and access data on demand • Designed and implemented a RESTful API utilizing Ruby on Rails and Postgresql to integrate with existing systems • Leveraged TDD to ensure the quality and correctness of the code • Created custom rake tasks to automate routine tasks • Collaborated with other developers to ensure the application was up to industry standards Scala Apache Kafka Education Java Golang BS in Computer Science Ruby on Rails UC Berkeley - Berkeley Mongo DB HTML5 / CSS3 2001 - 2005
As you can see:
- Every word is present.
- The content of the two columns is mixed up. The output starts with Experience, so the parser began with the main column. But in the middle of the first bullet point of the applicant's most recent work experience, it switches to the side column data.
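One plausible way to picture this kind of mix-up is a toy model in which the page is a set of positioned text fragments. To be clear, this is not Poppler's actual algorithm, and the coordinates, the `Fragment` type, and the column boundary below are invented for the illustration. The sketch only shows that the order in which fragments are read decides whether two columns stay separate.

```python
from typing import NamedTuple

class Fragment(NamedTuple):
    x: float  # horizontal position of the fragment on the page
    y: float  # vertical position, measured from the top of the page
    text: str

# Invented positions: side column at x=10, main column at x=200.
fragments = [
    Fragment(200, 10, "Experience"),
    Fragment(10, 10, "ARTHUR WALKER"),
    Fragment(200, 20, "Senior Engineer"),
    Fragment(10, 20, "Contact"),
    Fragment(200, 30, "Led a team"),
    Fragment(10, 30, "Languages"),
]

def row_by_row(frags: list[Fragment]) -> list[str]:
    """Naive reading order: top to bottom, then left to right."""
    return [f.text for f in sorted(frags, key=lambda f: (f.y, f.x))]

def column_by_column(frags: list[Fragment], column_boundary: float) -> list[str]:
    """Column-aware order: everything left of the boundary first, then the rest."""
    ordered = sorted(frags, key=lambda f: (f.x >= column_boundary, f.y, f.x))
    return [f.text for f in ordered]

print(row_by_row(fragments))             # the two columns are interleaved
print(column_by_column(fragments, 100))  # side column first, then main column
```

Real parsers use more elaborate heuristics, but the trade-off is the same: without a reliable guess about where the columns are, text from both sides ends up interleaved.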
Tool 2: Apache Tika 2.6.0
Apache Tika is a content analysis toolkit that detects and extracts metadata and text from various document types (not only PDF).
ARTHUR WALKER Crocker, San Francisco, USA +1 (555) 555 5555 arthur.walker@dummym.com English Native Japanese Started learning in 2020 LinkedIn Github Google - San Francisco Twitter Freshworks Self employed Scala Software Engineer Contact Address : Phone : Email : Languages Links Skills Scala Apache Kafka Java Golang Ruby on Rails Mongo DB HTML5 / CSS3 Experience Senior Engineer June 2019 - Now • Led a team to overhaul the backend messaging system based on Apache Kafka. This involved developing new components and architectures, optimizing existing systems, and ensuring that the system is secure and performant. • Responsible for designing and developing data pipelines and APIs, maintaining uptime and scalability, and providing technical guidance and support to other teams. Engineer - Scala & Kafka January 2015 - May 2019 • Developed and deployed microservices using Scala , Go in a team of 4 coders • Analyzed the existing systems and identified areas of improvement • Developed and deployed high-availability and fault-tolerant services • Collaborated with other teams to ensure smooth delivery of services, and adapt microservices REST routes Backend Engineer Jan 2011 - Nov 2014 • Leveraged Google Cloud Platform to manage a Kafka cluster deployed in Kubernetes and managed various aspects of the cluster, such as scaling, security, and performance. • Developed custom connectors using Kafka Connect API to integrate with other Google services such as BigQuery and Dataproc. 
Freelance Ruby On Rails coder Dec 2006 - Aug 2010 For various clients : • Developed and maintained Ruby on Rails web applications to enable users to store and access data on demand • Designed and implemented a RESTful API utilizing Ruby on Rails and Postgresql to integrate with existing systems • Leveraged TDD to ensure the quality and correctness of the code • Created custom rake tasks to automate routine tasks • Collaborated with other developers to ensure the application was up to industry standards Education BS in Computer Science UC Berkeley - Berkeley 2001 - 2005 mailto:arthur.walker@dummym.com https://www.linkedin.com/ https://www.github.com/
Here with Tika:
- Every word is present in the final text.
- The content of the two columns is still mixed up, though the result is slightly better than with the Poppler library. The parser starts with the data from the side column. It also extracts data that was missing from the Poppler output, such as the mailto: email link and the URLs behind the links. And the content of the Experience and Education sections is not mixed up with another section.
Apache Tika gives better results than Poppler, especially for the main sections, Experience and Education, whose content is reproduced faithfully. If an ATS runs a semantic analyzer on this text, it is more likely to produce good results.
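The link targets that Tika appends at the end of its output are easy to pick up afterwards. Here is a small sketch using a regular expression; the pattern and the function name `extract_links` are our own choices, not part of Tika, and the sample string is simply the tail of the output shown above.

```python
import re

# Matches web URLs and mailto: links up to the next whitespace character.
LINK_PATTERN = re.compile(r"(?:https?://|mailto:)\S+")

def extract_links(text: str) -> list[str]:
    """Return every http(s) or mailto link found in the text, in order."""
    return LINK_PATTERN.findall(text)

tika_tail = (
    "Education BS in Computer Science UC Berkeley - Berkeley 2001 - 2005 "
    "mailto:arthur.walker@dummym.com https://www.linkedin.com/ https://www.github.com/"
)
print(extract_links(tika_tail))
```

A parser that drops link targets leaves nothing for this kind of post-processing to find, which is one reason the extra data matters.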
Tool 3: PyPDF2
PyPDF2 is also a popular tool, but it gives mixed results on our resume. The content of the Experience section is preserved, yet it seems to have difficulty parsing sections with shorter content, such as lists. It also does not respect the reading flow as well as the previous tools. Excerpt:
ARTHUR WALKER Crocker, San Francisco, USA +1 (555) 555 5555 arthur.walker@dummym.com English Native JapaneseStarted learning in 2020 LinkedIn GithubGoogle -San F rancisco Twitter Freshworks Self emplo yed Scala Softwar e Engineer Contact Address : Phone : Email : Languages Links Skills Scala
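The excerpt shows two kinds of damage: words cut in two ("Softwar e") and words glued together ("JapaneseStarted"). A rough way to flag both, assuming you know the vocabulary of the original resume, is sketched below. The function `diagnose_tokens` and the hand-written vocabulary are hypothetical helpers for this illustration, not part of PyPDF2.

```python
def diagnose_tokens(extracted: str, vocabulary: set[str]) -> dict[str, list[str]]:
    """Flag tokens that look like split or fused versions of known words."""
    tokens = extracted.split()
    report: dict[str, list[str]] = {"split": [], "fused": []}
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token in vocabulary:
            i += 1
            continue
        # Split word: two neighbouring tokens form a known word once joined.
        if i + 1 < len(tokens) and token + tokens[i + 1] in vocabulary:
            report["split"].append(token + tokens[i + 1])
            i += 2
            continue
        # Fused words: one token can be cut into two known words.
        for cut in range(1, len(token)):
            if token[:cut] in vocabulary and token[cut:] in vocabulary:
                report["fused"].append(token)
                break
        i += 1
    return report

# Words as they appear in the original resume (assumed known for this check).
vocabulary = {
    "English", "Native", "Japanese", "Started", "learning", "in", "2020",
    "LinkedIn", "Github", "Google", "-", "San", "Francisco", "Twitter",
    "Freshworks", "Self", "employed", "Scala", "Software", "Engineer",
}
excerpt = (
    "English Native JapaneseStarted learning in 2020 LinkedIn GithubGoogle "
    "-San F rancisco Twitter Freshworks Self emplo yed Scala Softwar e Engineer"
)
print(diagnose_tokens(excerpt, vocabulary))
```

On this excerpt the sketch reports "Francisco", "employed" and "Software" as split words, and "JapaneseStarted", "GithubGoogle" and "-San" as fused ones. A keyword search would miss all six.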
Tool 4: Parsr
Parsr is a tool built on PDFMiner that adds OCR tools, such as Tesseract, on top of it. It is developed by the French company AXA and can be found here: https://github.com/axa-group/Parsr
It is a modular and highly customizable tool, which makes it slightly more difficult to use and probably less versatile than the other tools tested above. It was difficult to get consistent results for our resume, and the best results we obtained did not exceed those of Tika or pdftotext.
Conclusion
Parsing PDFs is notoriously tricky. Most parsers struggle to interpret documents the way a human would. However, three out of four parsers still output every word as it appears on the original resume. Impressively, all words from the Experience and Education sections are faithfully reproduced, and one parser, Tika, even keeps both sections free of content from the other column.
With advanced tools like the Adobe API and recent breakthroughs in Artificial Intelligence, we're confident that any resume will soon be parsed accurately and without errors.
At CVdunk, we're all in on two-column resumes! This format fills a gap, offers unique advantages, and may be the perfect fit for many candidates. Create your resume today with CVdunk and get the best of our single-column and two-column resume templates!