I'm processing a CSV file in awk. Partway through the script, I need to leave awk for a while and do some ImageMagick processing in the bash shell, based on the information awk has read from the CSV file. I want to do the image processing in the shell because I loop through all the jpg files in the nominated directory. My question is twofold:

  1. How can I pass variables from awk to bash so that I can do this? (These are $imageDirectory and $productRef in the code below.)
  2. I'm trying to avoid lots of system(someImageMagicCommand) calls, because it feels like using awk for something it isn't designed to do. Is there a better approach?

Here's the pseudocode sample that illustrates what I'm trying to achieve:

    #blah blah blah awk code
    #leave awk interpreter, go into bash
    '
    resizeSize="200x200"
    #loop over all the jpg files in the imageDirectory
    for i in "$imageDirectory"/*.jpg
    do
            #imageMagick converts large images to thumbnails
            convert "$i" -resize "$resizeSize" -otherFlagsEtc "assets/$productRef/thumbs/$(basename "$i")"
    done
    '#back into awk, more csv processing now...

Clarifying context: I have a csv file full of product information and associated file paths (1 product per line). I'm trying to automate the creation of webpages about the products. Part of this involves resizing a directory of images, the location of which is given as a field in the csv file. Hence, part way through the script, I'm trying to leave the awk interpreter, invoke ImageMagick to resize, and then return to the awk interpreter at the same record in the csv file and keep outputting HTML files.

There are about 100 (long) lines of script code before I get to the ImageMagick part and another hundred afterwards, so based on @Bushmills' answer I think the best thing would be to write the awk variables I'll need in bash to a small temp file, then exit awk and read the temp file from bash. However, how do I reinvoke awk and get it to start reading at the same record where it left off? Or do I just have to stay in awk and use a system() call? It doesn't seem sensible to wrap an entire bash for loop in an awk system() call, but I can't think of a more elegant way of calling ImageMagick on an entire directory of files.

  • You are showing us the middle of something that doesn't work because of what is around it - kind of hard to help really. I think you might do better if you step back and tell us the overall problem and your current approach, with more code. Commented Jul 17, 2014 at 11:44
  • @MarkSetchell I've updated the question with the context. Commented Jul 18, 2014 at 7:42

1 Answer

A FIFO (named pipe) may be a good way for you to proceed. The basic idea is that you create a tunnel out of awk and into your ImageMagick script, then pass requests from awk through the tunnel to ImageMagick.

So your main script could do this:

    #!/bin/bash
    ...
    mkfifo tunnel            # create the named pipe
    ImageMagickScript &      # start the reader in the background
    ...
    awk '{...
          ...
          print directory, size > "tunnel"
          ...
          ... }' file
    ...
    wait # for ImageMagick script that we started to finish

And your ImageMagick script could do this:

    #!/bin/bash
    # read one "directory size" request per line until awk closes the pipe
    while read -r directory size
    do
        convert ... "$directory" "$size" ...
    done < tunnel
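
To see the two halves working together, here is a minimal end-to-end sketch of the same named-pipe idea. It is only an illustration under some assumptions: `echo` stands in for the real `convert` call, the two-column CSV layout (productRef,imageDirectory) and the `worker` function are made up for the demo, and the pipe lives in a temporary directory rather than the current one.

```shell
#!/bin/bash
# Sketch only: echo stands in for convert, and the CSV layout
# (productRef,imageDirectory) is an assumption for illustration.
workdir=$(mktemp -d)
fifo="$workdir/tunnel"
log="$workdir/worker.log"
mkfifo "$fifo"

# Consumer: reads one request per line until the writer closes the FIFO.
worker() {
    while IFS=$'\t' read -r productRef imageDirectory
    do
        # Real use would loop over "$imageDirectory"/*.jpg and call convert here.
        echo "resize $imageDirectory -> assets/$productRef/thumbs"
    done < "$fifo" > "$log"
}
worker &

# Producer: awk keeps processing records and hands off one request per record.
printf '%s\n' 'p100,images/p100' 'p200,images/p200' |
awk -F, -v fifo="$fifo" '{
    # ... HTML generation for this record would go here ...
    print $1 "\t" $2 > fifo
}'

wait                      # block until the worker has drained the FIFO
result=$(cat "$log")
echo "$result"
rm -rf "$workdir"
```

awk never leaves the current record: it writes a request and carries on, while the background reader does the slow work. Note that awk may buffer its writes, so the reader might not see requests until awk exits; calling `fflush()` after the `print` should push each request through sooner if that matters.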

4 Comments

Worked very well, thank you. It took a little more effort than I anticipated due to another issue; I apologise for the tardiness. I'd also appreciate a comment on why I need the wait on the final line of the main script, because I've never programmed with named pipes before, and on whether Perl is a more appropriate language for this sort of thing long term (as I understand that's what you did before moving to WordPress, and you must have had your reasons).
@Escher There is no absolute need to have the wait, it just waits for the ImageMagick script that we started in the background on line 4 to finish. I did it in case you were running some timing tests, and your work is technically not complete until the ImageMagick part is finished processing too, so without the wait it would give you a misleading idea of how long it takes - that's all.
@Escher As regards a more appropriate way of doing it, I think the shell is fine. If you have 20,000+ files, I would consider using GNU Parallel to do the ImageMagick stuff so that all your 4, or 8, lovely Intel cores get busy... something like parallel convert {} -resize ... -strip {.}new.jpg ::: *.jpg
I recently had to do about 400 images to 3 different sizes. Surprisingly, when I ran on 4 cores instead of 1, it only halved the processing time...
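
For anyone without GNU Parallel installed, a rough sketch of the same fan-out idea using `xargs -P` instead (a swapped-in tool, not what the comment above used; `-P` and `-0` are GNU/BSD extensions rather than POSIX). As before, `echo` stands in for `convert`, and the file names are invented for the demo.

```shell
#!/bin/bash
# Sketch only: echo stands in for convert; the .jpg files are empty dummies.
workdir=$(mktemp -d)
touch "$workdir/a.jpg" "$workdir/b.jpg" "$workdir/c.jpg"

# Run up to 4 jobs at once, one file per job.
# -print0 / -0 keep file names with spaces intact.
thumbs=$(find "$workdir" -name '*.jpg' -print0 |
    xargs -0 -n 1 -P 4 sh -c 'echo "thumb_$(basename "$1")"' _ |
    sort)
echo "$thumbs"
rm -rf "$workdir"
```

The jobs finish in no particular order, which is why the output is sorted before use.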
