2

I use the following script (grep piped into awk) to do so:

for line in $1
do
 grep -F ".js" $1 | awk '{print $7}' | sort -u 
done 

The output is almost there:

/blog/wp-includes/js/swfobject.js?ver=2.2
/fla/AC_RunActiveContent.js
/include/jquery.js
/include/jquery.jshowoff2.js
/include/jquery.jshowoff.min.js
/include/js/jquery.lightbox-0.5.js
/scripts/ac_runactivecontent.js

I tried piping to cut -d "/" -f5 instead of awk, but then parts of the script name are cut off as well:

ac_runactivecontent.js HTTP
AC_RunActiveContent.js HTTP
jquery.jshowoff2.js HTTP
jquery.jshowoff.min.js HTTP
jquery.js HTTP
js
wp-includes

How would I go about extracting the text from the pattern .js back to the preceding "/" delimiter, so that I get only the script file names, like this:

swfobject.js
AC_RunActiveContent.js
jquery.js
jquery.jshowoff2.js
jquery.jshowoff.min.js
jquery.lightbox-0.5.js
ac_runactivecontent.js

4 Answers

1

It will probably be more efficient to replace the current for/grep/awk/sort combination with a single awk call (and an optional sort).

Setup:

$ cat filename.js
1 2 3 4 5 6 /blog/wp-includes/js/swfobject.js?ver=2.2 8 9 10
ignore this line
1 2 3 4 5 6 /fla/AC_RunActiveContent.js 8 9 10
1 2 3 4 5 6 /include/jquery.js 8 9 10
ignore this line
1 2 3 4 5 6 /include/jquery.jshowoff2.js 8 9 10
1 2 3 4 5 6 /include/jquery.jshowoff.min.js 8 9 10
ignore this line
1 2 3 4 5 6 /include/js/jquery.lightbox-0.5.js 8 9 10
1 2 3 4 5 6 /scripts/ac_runactivecontent.js 8 9 10

One awk idea:

awk '
/\.js/ { n=split($7,a,"[/?]")         # split field #7 on either "/" or "?", putting substrings into array a[]
        for (i=n;i>=1;i--)            # assuming desired string is toward end of $7 we will work backward through the array
        if (a[i] ~ /\.js/) {          # if we find a match (literal ".js") then ...
           print a[i]                 # print it and break out of the loop ...
           next                       # by going to next input record
        }
      }
' filename.js

# or as a single line:

awk '/\.js/ {n=split($7,a,"[/?]"); for (i=n;i>=1;i--) if (a[i] ~ /\.js/) { print a[i]; next}}' filename.js

This generates:

swfobject.js
AC_RunActiveContent.js
jquery.js
jquery.jshowoff2.js
jquery.jshowoff.min.js
jquery.lightbox-0.5.js
ac_runactivecontent.js

NOTE: the results can be piped to sort (or sort -u, as in the question) if desired.
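
As a rough sketch of how the awk idea and the de-duplicating sort from the question might be combined, the snippet below wraps both in a function. The function name js_names and the sample lines are made up for illustration; the only thing taken from the question is that the request path sits in field 7.

```shell
# Hypothetical helper: print the unique .js file names found in field 7 of a file.
js_names() {
  awk '/\.js/ {
         n = split($7, a, "[/?]")        # break the path on "/" and "?"
         for (i = n; i >= 1; i--)        # scan from the end of the path
           if (a[i] ~ /\.js/) { print a[i]; next }
       }' "$1" | sort -u                 # drop duplicate names
}

# Made-up sample data: the same script requested twice, once with a query string.
sample=$(mktemp)
printf '%s\n' \
  '1 2 3 4 5 6 /include/jquery.js 8 9 10' \
  'ignore this line' \
  '1 2 3 4 5 6 /blog/js/jquery.js?ver=1 8 9 10' \
  '1 2 3 4 5 6 /fla/AC_RunActiveContent.js 8 9 10' > "$sample"

js_names "$sample"
rm -f "$sample"
```

The order of the printed names depends on the locale that sort runs under, so it may differ between systems.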

1 Comment

Thank you @markp-fuso, your answer helped me understand the awk command and its versatility better.

1

Using awk, you could print the part of the 7th column that matches the file name.

The pattern [^/]+\.js matches one or more characters other than /, followed by a literal .js

For example, with a file as input:

awk '
match($7, /[^/]+\.js/) {
  print substr($7, RSTART, RLENGTH)
}
' file

Output

swfobject.js
AC_RunActiveContent.js
jquery.js
jquery.jshowoff2.js
jquery.jshowoff.min.js
jquery.lightbox-0.5.js
ac_runactivecontent.js
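
To try the pattern on a single path rather than a whole file, a small wrapper along these lines may help. The name script_name is hypothetical, and the pattern is passed as a string ("[^/]+\\.js") rather than a regex literal, which is one way to avoid depending on how a particular awk treats a bare / inside /.../.

```shell
# Hypothetical helper: print the script name embedded in one request path.
script_name() {
  printf '%s\n' "$1" |
    awk 'match($0, "[^/]+\\.js") { print substr($0, RSTART, RLENGTH) }'
}

script_name '/blog/wp-includes/js/swfobject.js?ver=2.2'   # query string is left out of the match
script_name '/include/jquery.jshowoff.min.js'             # dots inside the name are kept
```

Because [^/]+ cannot cross a slash, the match starts after a directory separator, and the trailing ?ver=2.2 is excluded because the match ends at .js.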

0

Since you are already using awk, the answer provided by @markp-fuso is probably your best option. If you are open to other options, you may be able to use a combination of grep and basename. (Note that this will likely be less efficient, since xargs has to start basename once for every path that grep prints.)

Using the sample file from the answer provided by @markp-fuso, the following (with -n 1 so that basename receives one path per invocation; some basename implementations reject more than two operands):

grep -o ' /.*\.js' tt.dat | xargs -n 1 basename

Produces the following output:

swfobject.js
AC_RunActiveContent.js
jquery.js
jquery.jshowoff2.js
jquery.jshowoff.min.js
jquery.lightbox-0.5.js
ac_runactivecontent.js
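
If starting basename for every line is a concern, a grep-only variant is one possibility. This is a different technique from the pipeline above, sketched under the assumption that the grep in use supports -o and -E; the function name js_from_grep and the sample line are made up.

```shell
# Hypothetical helper: list script names using grep alone.
# [^/ ]+ matches a run of characters containing neither "/" nor a space,
# so a match never spans a directory separator or a field boundary.
js_from_grep() {
  grep -oE '[^/ ]+\.js' "$1"
}

# Made-up sample line in the same layout as the data above.
demo=$(mktemp)
printf '%s\n' '1 2 3 4 5 6 /blog/wp-includes/js/swfobject.js?ver=2.2 8 9 10' > "$demo"
js_from_grep "$demo"
rm -f "$demo"
```

Unlike the awk answers, this looks at the whole line rather than only field 7, so a .js name in another field (a referrer, for instance) would be printed too.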

-1

Try the basename command; see man basename for details.
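
As a minimal sketch of how basename might be applied to the paths in the question, the helper below first strips an optional query string with a shell parameter expansion and then calls basename. The function name path_to_name is hypothetical.

```shell
# Hypothetical helper: reduce one request path to its file name.
path_to_name() {
  p=${1%%\?*}        # drop a query string such as "?ver=2.2", if present
  basename "$p"      # then drop the leading directories
}

path_to_name '/blog/wp-includes/js/swfobject.js?ver=2.2'
path_to_name '/include/jquery.js'
```

basename on its own would keep the ?ver=2.2 suffix, which is why the expansion runs first.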
