I got the following error messages while running this code to read a PDF file and extract text from it.
CODE:
import PyPDF2
# Open the PDF in binary mode and read the text of the first page
pdfFileObject = open('pythonhelp.pdf', 'rb')
pdfReader = PyPDF2.PdfFileReader(pdfFileObject)
page = pdfReader.getPage(0)
print(page.extract_text())
ERROR MESSAGE:
Superfluous whitespace found in object header b'1' b'0'
Superfluous whitespace found in object header b'2' b'0'
Superfluous whitespace found in object header b'3' b'0'
Superfluous whitespace found in object header b'54' b'0'
Superfluous whitespace found in object header b'65' b'0'
Superfluous whitespace found in object header b'68' b'0'
Superfluous whitespace found in object header b'53' b'0'
Superfluous whitespace found in object header b'15' b'0'
Superfluous whitespace found in object header b'14' b'0'
Superfluous whitespace found in object header b'13' b'0'
Superfluous whitespace found in object header b'23' b'0'
Superfluous whitespace found in object header b'22' b'0'
Superfluous whitespace found in object header b'21' b'0'
Superfluous whitespace found in object header b'31' b'0'
Superfluous whitespace found in object header b'30' b'0'
Superfluous whitespace found in object header b'29' b'0'
Superfluous whitespace found in object header b'39' b'0'
Superfluous whitespace found in object header b'38' b'0'
Superfluous whitespace found in object header b'37' b'0'
Superfluous whitespace found in object header b'51' b'0'
Superfluous whitespace found in object header b'50' b'0'
Superfluous whitespace found in object header b'49' b'0'
Superfluous whitespace found in object header b'52' b'0'
SOLUTION:
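Pass the parameter "strict=False" when creating the reader. With strict mode off, PyPDF2 tolerates this kind of minor formatting problem in the file instead of reporting it. This resolved the problem for me.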
pdfReader = PyPDF2.PdfFileReader(pdfFileObject, strict=False)
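The same fix can be wrapped in a small helper. This is only a minimal sketch, assuming PyPDF2 2.x or later, where PdfReader and reader.pages are the newer names for PdfFileReader and getPage(). The function name extract_page_text and its reader_cls parameter are hypothetical, added here so the helper can be tried with a stand-in reader class. They are not part of PyPDF2.

```python
def extract_page_text(path, page_index=0, reader_cls=None):
    """Return the text of one page, reading the PDF in non-strict mode."""
    if reader_cls is None:
        # Imported lazily. Assumes PyPDF2 2.x or later, where PdfReader
        # is the newer name for PdfFileReader.
        from PyPDF2 import PdfReader as reader_cls
    # The with-block closes the file even if parsing raises an exception.
    with open(path, "rb") as pdf_file:
        # strict=False is the fix described above.
        reader = reader_cls(pdf_file, strict=False)
        # Extract the text while the file is still open.
        return reader.pages[page_index].extract_text()
```

Calling extract_page_text('pythonhelp.pdf') should return the text of the first page, as the original script does, while also closing the file when it finishes.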