I have a shell script that runs every day to build a list of the previous day's files and count the number of requests inside each file.

To create the list of files I use the find command as below:

find ${Search_Path} -type f  -newer ./start-time \! -newer ./end-time |egrep '\.5500\.|\.5000\.' >IncomBQR.txt

In the past I faced a problem where it seemed to output the same row multiple times.

So I tried to fix it by creating a unique list as below:

sort -u IncomBQR.txt>IncomBQR1.txt
cat IncomBQR1.txt>IncomBQR.txt
rm -f IncomBQR1.txt

But after a few months that also failed. Could you please help me debug the problem?

When the script runs, the output I get is

${Search_Path}/file1   
${Search_Path}/file1

whereas I should be getting only one row for "file1".

However, the strange thing is that when I run it manually, it finds only 1 row.

  • I think we need some examples of the text. But since a unique sort does not weed out the duplicates, you may have whitespace that is invisible to the eye but easily detected by wc (word count) or similar. Commented Mar 13, 2016 at 16:45
  • @komenten: Thanks for the response. The file name is /gtpfssharepath/ORGXXXX/processed/billRequest/myrquest-20160312.DAT and in my output listing it appears twice: /gtpfssharepath/ORGXXXX/processed/billRequest/myrquest-20160312.DAT |35 and /gtpfssharepath/ORGXXXX/processed/billRequest/myrquest-20160312.DAT |35. Could the problem be due to the GPFS file system? Commented Mar 13, 2016 at 18:49
  • Perhaps find also picks up IncomBQR.txt itself and adds the grep results from that file. Try redirecting the find output to a place in another directory tree, such as /tmp/IncomBQR.txt. Commented Mar 13, 2016 at 20:07
  • If manually running find only returns the file once (as it should) then there's a problem with your script. While post-processing the file might patch over the problem, it's not fixing it. Commented Mar 14, 2016 at 18:32
  • Does Search_Path contain multiple directory entries? Repeating a directory name will cause the duplicate output. Try debugging with echo "Search_Path=$Search_Path;" Commented Mar 14, 2016 at 20:27
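
The checks suggested in the comments above can be sketched roughly as follows. This is only an illustration: the sample lines, the temporary file standing in for IncomBQR.txt, and the value given to Search_Path are all assumptions, not data from the real script. It relies on `sed -n l`, which prints each line unambiguously and marks the line end with `$`, so trailing blanks become visible.

```shell
#!/bin/sh
# Hypothetical stand-in for IncomBQR.txt: two lines that look identical
# on screen but differ by trailing spaces (paths are illustrative only).
listing=$(mktemp)
printf '%s\n' '/data/file1   ' '/data/file1' > "$listing"

# sort -u compares whole lines, so both lines survive the dedupe.
unique_count=$(sort -u "$listing" | wc -l | tr -d ' ')
echo "lines after sort -u: $unique_count"

# sed -n l ends every line with '$', exposing any trailing blanks.
visible=$(sed -n l "$listing")
printf '%s\n' "$visible"

# Print the search path between brackets to spot repeated or padded
# entries. The value here is assumed, purely for illustration.
Search_Path='/data /data'
printf 'Search_Path=[%s]\n' "$Search_Path"

rm -f "$listing"
```

If the real listing shows blanks before the `$` on one of the "duplicate" lines, the lines are not byte-identical, which would explain why the unique sort keeps both.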

2 Answers

1

You can simplify your find command and avoid the pipe to grep:

find "$Search_Path" -type f -newer ./start-time ! -newer ./end-time -name '*.5[50]00.*'

If you're seeing the same file returned multiple times, this is most likely a problem with your script, and not the result of find.
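
As a rough sketch of how this could be tried in isolation, the following builds a scratch directory with reference files and a few sample files, then runs the simplified find. Every path, file name, and timestamp here is hypothetical. It also writes the listing outside the searched tree, as one of the comments on the question suggests, so find cannot pick up its own output file.

```shell
#!/bin/sh
# All paths below are hypothetical and live in a scratch directory.
work=$(mktemp -d)
Search_Path="$work/processed"
mkdir -p "$Search_Path"

# Reference files marking the start and end of the time window.
touch -t 201603120000 "$work/start-time"
touch -t 201603122359 "$work/end-time"

# One matching file inside the window, one matching name outside the
# window, and one file inside the window whose name does not match.
touch -t 201603121200 "$Search_Path/req.5500.DAT"
touch -t 201603131200 "$Search_Path/req.5000.DAT"
touch -t 201603121200 "$Search_Path/req.7000.DAT"

# Write the listing outside the search tree.
out="$work/IncomBQR.txt"
find "$Search_Path" -type f -newer "$work/start-time" \
    ! -newer "$work/end-time" -name '*.5[50]00.*' > "$out"

listing=$(cat "$out")
match_count=$(wc -l < "$out" | tr -d ' ')
printf '%s\n' "$listing"

rm -rf "$work"
```

One difference from the original pipeline is worth keeping in mind: `-name` tests only the file's base name, while the egrep filter matched anywhere in the full path, so the two are equivalent only if the `.5500.` or `.5000.` part never appears in a directory name.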


0

You forgot to actually use "uniq" in your fix-up attempt; sorting alone is not enough. You can do it in one invocation:

find ... | sort | uniq > IncomBQR.txt
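
For whole-line comparison, `sort -u` and `sort | uniq` give the same result, so if duplicates survive either form, the lines probably differ in characters that are hard to see. The question's sample output appears to show trailing blanks after the first `file1`, which would fit. The sketch below assumes trailing whitespace is the cause (this is not confirmed anywhere in the thread) and strips it before removing duplicates. The sample paths are illustrative.

```shell
#!/bin/sh
# Sample lines standing in for the find output (paths are illustrative).
raw=$(printf '%s\n' '/data/file1   ' '/data/file1' '/data/file2')

# Strip trailing whitespace from every line, then sort and drop duplicates.
deduped=$(printf '%s\n' "$raw" | sed 's/[[:space:]]*$//' | sort -u)
printf '%s\n' "$deduped"
```

This only patches the listing after the fact. If the script itself emits each file twice, the cause still needs to be found in the script.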
