I am new to Python and have been trying to transform an XML file using XSLT. This is the script I came up with, generate_ioc.py:

import os
import sys
import time
import shutil
import logging
import Queue
import ftplib
import subprocess
import re
import lxml.etree as ET
try:
    report_path = os.path.join("/home",
                               "user",
                               "Desktop",
                               "stix_to_openioc")
    stix_report = os.path.join(report_path, "report.stix.xml")
    print("Retrieve STIX report successful")

    # Sanitize report (oxb, remove xlms)
    with open(stix_report, "r") as stix_file:
        stix_xml = stix_file.read()
    stix_xml = re.sub(u"xmlns='https://github.com/STIXProject/schemas/blob/master/stix_core.xsd' ", u"", stix_xml)
    RE_XML_ILLEGAL = u'([\u0000-\u0008\u000b-\u000c\u000e-\u001f\ufffe-\uffff])' + \
                     u'|' + \
                     u'([%s-%s][^%s-%s])|([^%s-%s][%s-%s])|([%s-%s]$)|(^[%s-%s])' % \
                     (unichr(0xd800),unichr(0xdbff),unichr(0xdc00),unichr(0xdfff),
                      unichr(0xd800),unichr(0xdbff),unichr(0xdc00),unichr(0xdfff),
                      unichr(0xd800),unichr(0xdbff),unichr(0xdc00),unichr(0xdfff))
    stix_xml = re.sub(RE_XML_ILLEGAL, "?", stix_xml)

    print("Sanitize STIX report successful")

    # Save sanitized report to file
    tree = ET.XML(stix_xml)
    with open(stix_report, "w") as stix_file:
        stix_file.write(ET.tostring(tree))
    print("Save STIX report successful")

    # Get xsl file
    xslt_path = os.path.join("/home",
                             "user",
                             "Desktop",
                             "stix_to_openioc",
                             "stix_to_openioc.xsl")
    print("Retrieve XSL file successful")

    #Perform xsl tranformation
    #dom = ET.parse(stix_report)
    #xslt = ET.parse(xslt_path)
    #transform = ET.XSLT(xslt)
    #newdom = transform(dom)
    #xslt_xml = ET.tostring(newdom, pretty_print=True)
    #print("XSL transformation successful")
    #print(xslt_xml)
    #I have tried this ^ but it resulted in the same error

    from lxml import etree
    f_xsl = 'stix_to_openioc.xsl'
    f_xml = 'report.stix.xml'
    f_out = 'report.ioc.xml'

    transform = etree.XSLT(etree.parse(f_xsl))
    result = transform(etree.parse(f_xml))
    result.write(f_out)

    # Get new stix file
    openioc_report = os.path.join(report_path,
                                  "report.openioc.xml")
    print("Retrieve OpenIOC report successful")

    # Save stix report
    with open(openioc_report, "w") as openioc_file:
        openioc_file.write(xslt_xml)
    print("Save OpenIOC report successful")

except OSError as e:
    log.warning("Error accessing stix report (task=%d): %s", self.task.id, e)

Error output:

/usr/bin/python2 /home/user/Desktop/IOC/generate_ioc.py
Retrieve STIX report successful
Sanitize STIX report successful
Save STIX report successful
Retrieve XSL file successful
Traceback (most recent call last):
  File "/home/user/Desktop/IOC/generate_ioc.py", line 53, in <module>
    transform = ET.XSLT(xslt)
  File "xslt.pxi", line 403, in lxml.etree.XSLT.__init__ (src/lxml/lxml.etree.c:122894)
lxml.etree.XSLTParseError: xsltParseStylesheetProcess : document is not a stylesheet

Process finished with exit code 1

I would like to know what I have done wrong to cause this. Note: I am quite new to Python and XML, so any advice would be appreciated; I am willing to learn and spend time on correcting this error. Note 2: stix_to_openioc.xsl is present in the correct directory.

1 Answer

You're doing some cleanup on the XML and saving it to the file whose path is stored in stix_report:

with open(stix_report, "w") as stix_file:
    stix_file.write(ET.tostring(tree))
print("Save STIX report successful")

Then you build the path to the XSLT file in xslt_path:

xslt_path = os.path.join("/home",
            "user",
            "Desktop",
            "stix_to_openioc",
            "stix_to_openioc.xsl")
print("Retrieve XSL file successful")

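But then you load the XML and XSL files through new variables that hold bare filenames, so lxml looks for them relative to the current working directory rather than at the paths you built above: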

from lxml import etree
f_xsl = 'stix_to_openioc.xsl'
f_xml = 'report.stix.xml'
f_out = 'report.ioc.xml'

transform = etree.XSLT(etree.parse(f_xsl))
result = transform(etree.parse(f_xml))
result.write(f_out)
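
To see which files those bare filenames actually point at, it can help to print their absolute form before parsing. This is only a minimal sketch using the standard library; describe_path is a hypothetical helper name, not part of lxml or of your script:

```python
import os


def describe_path(path):
    """Return (absolute_path, exists) for a possibly relative path.

    Hypothetical helper for debugging: a relative path is resolved
    against the current working directory, the same way it would be
    when the file is opened.
    """
    absolute = os.path.abspath(path)
    return absolute, os.path.isfile(absolute)


if __name__ == "__main__":
    # The bare filenames used in the question's script.
    for name in ("stix_to_openioc.xsl", "report.stix.xml"):
        absolute, exists = describe_path(name)
        print("%s -> %s (exists: %s)" % (name, absolute, exists))
```

If the printed paths are not under /home/user/Desktop/stix_to_openioc, the script is reading different files than the ones you cleaned up and located earlier.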

I can't guarantee that it will work (since I have no idea what's in these files), but I think a good start is to change the loading code so it uses the paths you already built:

from lxml import etree
f_xsl = xslt_path
f_xml = stix_report
f_out = 'report.ioc.xml'

transform = etree.XSLT(etree.parse(f_xsl))
result = transform(etree.parse(f_xml))
result.write(f_out)
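
If the error persists with the right paths, the message "document is not a stylesheet" suggests the file being parsed does not have an XSLT root element. As a rough check, assuming the file is at least well-formed XML, you could inspect the root tag with the standard library. looks_like_stylesheet is a hypothetical helper name, and the check deliberately ignores the less common "simplified stylesheet" form (a literal result element carrying an xsl:version attribute):

```python
import xml.etree.ElementTree as ElementTree

# Namespace URI that XSLT 1.0 stylesheets declare for the xsl: prefix.
XSL_NS = "http://www.w3.org/1999/XSL/Transform"


def looks_like_stylesheet(path):
    """Return True if the root element is xsl:stylesheet or xsl:transform.

    Hypothetical diagnostic helper. Raises ElementTree.ParseError if the
    file is not well-formed XML, which is itself a useful hint.
    """
    root = ElementTree.parse(path).getroot()
    expected = ("{%s}stylesheet" % XSL_NS, "{%s}transform" % XSL_NS)
    return root.tag in expected
```

Calling looks_like_stylesheet(xslt_path) before etree.XSLT(...) would tell you whether the file on disk is a stylesheet at all. A False result or a parse error usually means the file content is something else, for example a web page saved in place of the raw .xsl file.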