You can upload one or more PDFs to scrape their text for analysis. The text is broken into sentences, and those sentences are indexed into a semi-structured data frame with one row per sentence.
How to Upload a PDF
You can use the PDF uploader on the Data Streams page, as we do here, or directly in a workspace. Either way, you can scrape the text from the files and tag it as needed.
- From your workspace, in the navigation panel to the left, select the Data tab.
- From the Data page, click the (+) icon in the lower right corner. It will expand to say + New Data.
- In the Create New Data Stream modal that appears, under the Upload category, select PDF File.
- Select the file(s) you would like to upload. The file extension must be .pdf. Click Next.
- Name your data stream. You can also add a description and relevant tags, or share the stream with team members. Click Submit once you’re ready to import the data.
- Your file will begin to ingest. It may take some time to process.
The following columns will be created from your raw data and made available for analysis:
Create_date
File_name
Scrape_date
Sentence_ID
Text
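To give a feel for what a sentence-level data frame looks like, here is a minimal Python sketch that builds rows with the columns listed above. It is an illustration only, not the platform's actual ingestion code: the function name `index_sentences`, the naive punctuation-based sentence splitter, the sample file name and dates, and the choice to start Sentence_ID at 1 are all assumptions. It also assumes the text has already been extracted from the PDF.

```python
import re
from datetime import date


def index_sentences(file_name, text, create_date, scrape_date):
    """Return one row (a dict) per sentence found in `text`.

    Hypothetical helper: the column names mirror the list above,
    but the splitting and numbering rules are assumptions.
    """
    # Naive splitter: break on whitespace that follows ., ! or ?
    pieces = re.split(r"(?<=[.!?])\s+", text)
    sentences = [p.strip() for p in pieces if p.strip()]

    return [
        {
            "Create_date": create_date,   # assumed: when the file was created
            "File_name": file_name,       # the uploaded PDF's name
            "Scrape_date": scrape_date,   # assumed: when the text was scraped
            "Sentence_ID": i,             # assumed: 1-based position in the file
            "Text": sentence,
        }
        for i, sentence in enumerate(sentences, start=1)
    ]


# Example with made-up values
rows = index_sentences(
    "report.pdf",
    "Revenue rose. Costs fell! What next?",
    create_date=date(2023, 1, 5),
    scrape_date=date(2023, 2, 1),
)
for row in rows:
    print(row["Sentence_ID"], row["File_name"], row["Text"])
```

Each sentence becomes its own row, so you can filter, tag, or count at the sentence level while still tracing every row back to its source file through File_name.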
Further questions?
We're here to help! Don't hesitate to contact us via chat or submit a ticket for further assistance.