
Example of my dataset file, "doc name with spaces.csv" (with anonymized data), which has multiple lines. The length of the file varies from day to day because it is produced by an export.

Patient Full Name   Order Date Of Service   Order Accession Number  Day of Patient Birth Date   Procedure Description   Facility Name   
AAAAA, Ms Joan  10/11/2022  xx.1111111  1 November 2000 Ultrasound Obstetric 22+ Weeks  Facility 1  
BBBBB, Mr John  10/11/2022  xx.2222222  2 July 2000 Ultrasound Left Calf    Facility 2  
CCCCC, Mrs Anne 10/11/2022  xx.3333333  3 July 2000 X-ray Chest Facility 3  
DDDDD, Master Jack  10/11/2022  xx.4444444  4 July 2000 Ultrasound Left Ankle   Facility 4
....

I am trying to create a batch script that will:

  1. Read each Line of "doc name with spaces.csv"
  2. Delete all occurrences of strings matching lines found in "titles.txt" (located in same directory)
  3. Delete first TAB (\t) found per line, and everything after it on same line.
  4. Copy results to Windows clipboard

Example:

AAAAA, Ms Joan  10/11/2022  xx.1111111  1 November 2000 Ultrasound Obstetric 22+ Weeks  Facility 1
BBBBB, Mr John  10/11/2022  xx.2222222  2 July 2000 Ultrasound Left Calf    Facility 2  

to

AAAAA, Joan
BBBBB, John

NB: The title is always followed by a white space, so there is no risk of removing Dr or Mr etc. from a name, provided the white space is included in the find/delete. Contents of "titles.txt" below:

Mrs 
Mr 
Miss 
Ms 
Dr 
Prof 
A/Prof 

I have looked at other scripts online, but none quite match what I'm doing. They are also a bit advanced for where I am currently at, but the need for this has arisen regardless.

    Please provide enough code so others can better understand or reproduce the problem. Commented Nov 11, 2022 at 13:30

1 Answer

@ECHO OFF
SETLOCAL
rem The following settings for the directories and filename are names
rem that I use for testing and deliberately include names which include spaces to make sure
rem that the process works using such names. These will need to be changed to suit your situation.

SET "sourcedir=u:\your files"
SET "filename1=%sourcedir%\q74397743.txt"
SET "filename2=%sourcedir%\q74397743_2.txt"
SET "destdir=u:\your results"
SET "outfile=%destdir%\outfile.txt"

(
FOR /f "usebackq skip=1 delims=" %%e IN ("%filename1%") DO @CALL :process %%e
)>"%outfile%"
TYPE "%outfile%"|clip

GOTO :EOF

:process
:: first parameter = patient_id
SET "patient_id=%1"
SET "patient_name="
SHIFT
:: Second parameter = Title
:: skip if on titles list
FINDSTR /i /x "%1" "%filename2%">NUL
IF NOT ERRORLEVEL 1 SHIFT
:: Build name until %1 begins with a numeric
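:nameloop
:: Stop at end of line as well, so a row without a date cannot loop forever
IF "%~1"=="" ECHO %patient_id%,%patient_name%&GOTO :eof
SET "nextpart=%1"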
SET "firstchar=%nextpart:~0,1%"
FOR /L %%z IN (0,1,9) DO IF "%firstchar%"=="%%z" ECHO %patient_id%,%patient_name%&GOTO :eof
SET "patient_name=%patient_name% %nextpart%"
SHIFT
GOTO nameloop
GOTO :eof

Always verify against a test directory before applying to real data.

Note that if the filename does not contain separators like spaces, then both usebackq and the quotes around %filename1% can be omitted.

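You don't indicate where the tabs are. "Master" is missing from the titles file. For my test, the spaces at the end of each line in the titles file were removed, since FINDSTR /x needs the whole line to match the title token exactly.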
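If the export does turn out to be tab-separated, the steps listed in the question could be sketched roughly as below. This is Python rather than batch, offered only as a way to cross-check results. The function name strip_line is hypothetical, and adding "Master" to the title list is an assumption based on the sample data, not something taken from the original titles.txt.

```python
import re

# Titles as listed in the question's titles.txt, plus "Master", which appears
# in the sample data but not in that file (added here as an assumption).
TITLES = ["Mrs", "Mr", "Miss", "Ms", "Dr", "Prof", "A/Prof", "Master"]

# Match a title only as a whole word followed by whitespace, so the "Dr" in a
# surname such as "Drake" is left alone. Longer titles are tried first.
_alternatives = "|".join(
    re.escape(t) for t in sorted(TITLES, key=len, reverse=True)
)
TITLE_RE = re.compile(r"(?<![\w/])(?:" + _alternatives + r")\s+")


def strip_line(line):
    """Keep the text before the first tab, then delete any titles from it."""
    name_field = line.split("\t", 1)[0]
    return TITLE_RE.sub("", name_field).strip()
```

Cutting at the first tab before deleting titles gives the same name as the order in the question, and it avoids touching anything in the later columns.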

I assumed that, since the date follows the name, finding a field that starts with a numeric is sufficient to detect the end of the name.

Surnames missing despite column name "full name"

The script simply reads each line, extracts the first token, skips the second token if it is on the titles list, then joins the following tokens together until one starting with a numeric character is found. The comma, spaces and tabs act as separators for the subroutine parameters.
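For comparison, the same token rules can be written out in Python. This is a rough sketch, not a translation of the batch file: the names extract_name and TITLES are hypothetical, "master" is added to the titles as an assumption, and only commas and whitespace are treated as separators here (CMD also splits parameters on a few other characters).

```python
import re

# Lower-cased titles for a case-insensitive lookup, similar to FINDSTR /i.
# "master" is an assumed addition; it is not in the question's titles.txt.
TITLES = {"mrs", "mr", "miss", "ms", "dr", "prof", "a/prof", "master"}


def extract_name(line, titles=TITLES):
    """Return 'FIRSTTOKEN, remaining name parts' for one data line."""
    # Commas, spaces and tabs all act as separators between tokens.
    tokens = [t for t in re.split(r"[,\s]+", line) if t]
    if not tokens:
        return ""
    first, rest = tokens[0], tokens[1:]
    # Skip the second token when it is a known title.
    if rest and rest[0].lower() in titles:
        rest = rest[1:]
    name_parts = []
    for token in rest:
        # A token starting with a digit is taken to be the date,
        # which marks the end of the name.
        if token[0].isdigit():
            break
        name_parts.append(token)
    return first + ", " + " ".join(name_parts)
```

Unlike the batch loop, the for loop here also ends when the line runs out of tokens, so a row with no date still returns whatever name parts were collected.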


1 Comment

Hi Magoo, testing your answer out. Haven't forgotten to up-vote if it helps :)
