I would like to parse an XML file like this:

<?xml version="1.0" encoding="utf-8"?>
<SolarForecastingChartDataForZone xmlns="http://schemas.datacontract.org/2004/07/Elia.PublicationService.DomainInterface.SolarForecasting.v3" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
    <ErrorMessage i:nil="true"/>
    <IntervalInMinutes>15</IntervalInMinutes>
    <SolarForecastingChartDataForZoneItems>
    <SolarForecastingChartDataForZoneItem>
        <DayAheadForecast>-50</DayAheadForecast>
        <DayAheadP10>-50</DayAheadP10>
        <DayAheadP90>-50</DayAheadP90>
        <Forecast>0</Forecast>
        <ForecastP10>0</ForecastP10>
        <ForecastP90>0</ForecastP90>
        <ForecastUpdated>0</ForecastUpdated>
        <IntraDayP10>-50</IntraDayP10>
        <IntraDayP90>-50</IntraDayP90>
        <LoadFactor>0</LoadFactor>
        <RealTime>0</RealTime>
        <StartsOn xmlns:a="http://schemas.datacontract.org/2004/07/System">
            <a:DateTime>2013-09-29T22:00:00Z</a:DateTime>
            <a:OffsetMinutes>0</a:OffsetMinutes>
        </StartsOn>
        <WeekAheadForecast>-50</WeekAheadForecast>
        <WeekAheadP10>-50</WeekAheadP10>
        <WeekAheadP90>-50</WeekAheadP90>
    </SolarForecastingChartDataForZoneItem>
    <SolarForecastingChartDataForZoneItem>
        <DayAheadForecast>-50</DayAheadForecast>
        <DayAheadP10>-50</DayAheadP10>
        <DayAheadP90>-50</DayAheadP90>
        <Forecast>0</Forecast>
        <ForecastP10>0</ForecastP10>
        <ForecastP90>0</ForecastP90>
        <ForecastUpdated>0</ForecastUpdated>
....

I want to extract the values of the <Forecast> and <a:DateTime> elements.

I tried Beautiful Soup and minidom, for example:

from xml.dom import minidom
xmldoc = minidom.parse('xmlfile')
itemlist = xmldoc.getElementsByTagName('Forecast')
print(len(itemlist)) #to get the number of savings
for s in xmldoc.getElementsByTagName('Forecast'):
    print s.nodeValue

But I can't get any values out. I guess I'm doing something wrong, but I don't understand what. Could someone help me? Thank you.

1 Answer

I'm not exactly sure what your desired output is, but I was working with lxml and XPath when I saw this question.

>>> from lxml import html
>>> mystring = ''' I cut and pasted your string here '''
>>> tree = html.fromstring(mystring)
>>> for forecast in tree.xpath('//forecast'):
...     forecast.text_content()
...
'0'
'0'
>>> for dtime in tree.xpath('//datetime'):
...     dtime.text_content()
...
'2013-09-29T22:00:00Z'
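
The session above goes through lxml's HTML parser, which is why the tag names come out lowercased and the XPath expressions don't mention the namespaces. If you would rather keep the original tag names and namespaces, something like the following might work with only the standard library's xml.etree.ElementTree. This is just a sketch: the XML is a trimmed copy of the sample in the question, and the prefix "f", the NS mapping and the extract() helper are names I made up for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the XML from the question (a single item, fewer fields).
XML = b"""<?xml version="1.0" encoding="utf-8"?>
<SolarForecastingChartDataForZone xmlns="http://schemas.datacontract.org/2004/07/Elia.PublicationService.DomainInterface.SolarForecasting.v3" xmlns:i="http://www.w3.org/2001/XMLSchema-instance">
    <IntervalInMinutes>15</IntervalInMinutes>
    <SolarForecastingChartDataForZoneItems>
    <SolarForecastingChartDataForZoneItem>
        <Forecast>0</Forecast>
        <StartsOn xmlns:a="http://schemas.datacontract.org/2004/07/System">
            <a:DateTime>2013-09-29T22:00:00Z</a:DateTime>
            <a:OffsetMinutes>0</a:OffsetMinutes>
        </StartsOn>
    </SolarForecastingChartDataForZoneItem>
    </SolarForecastingChartDataForZoneItems>
</SolarForecastingChartDataForZone>"""

# Prefix-to-URI mapping used only in the search paths below.
# "f" is an arbitrary prefix chosen here for the document's default namespace.
NS = {
    "f": "http://schemas.datacontract.org/2004/07/Elia.PublicationService.DomainInterface.SolarForecasting.v3",
    "a": "http://schemas.datacontract.org/2004/07/System",
}


def extract(xml_bytes):
    """Return a list of (DateTime, Forecast) string pairs, one per item."""
    root = ET.fromstring(xml_bytes)
    rows = []
    for item in root.iterfind(".//f:SolarForecastingChartDataForZoneItem", NS):
        forecast = item.findtext("f:Forecast", namespaces=NS)
        starts_on = item.findtext("f:StartsOn/a:DateTime", namespaces=NS)
        rows.append((starts_on, forecast))
    return rows


rows = extract(XML)
print(rows)
```

Pairing the two values per item, rather than collecting two separate lists, keeps each forecast next to its timestamp, which is probably what you want for a time series.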

And then, to explore the tree a bit more:

all_elements = list(tree.iter())
# Skip the first element: it is the root, and its text_content() is just
# all of the document's text without the tags.
for each_element in all_elements[1:]:
    each_element.tag, each_element.text_content()

('errormessage', '')
('intervalinminutes', '15')
('solarforecastingchartdataforzoneitems', '\n    \n        -50\n        -50\n        -50\n        0\n        0\n        0\n        0\n        -50\n        -50\n        0\n        0\n        \n            2013-09-29T22:00:00Z\n            0\n        \n        -50\n        -50\n        -50\n    \n    \n        -50\n        -50\n        -50\n        0\n        0\n        0\n        0')
('solarforecastingchartdataforzoneitem', '\n        -50\n        -50\n        -50\n        0\n        0\n        0\n        0\n        -50\n        -50\n        0\n        0\n        \n            2013-09-29T22:00:00Z\n            0\n        \n        -50\n        -50\n        -50\n    ')
('dayaheadforecast', '-50')
('dayaheadp10', '-50')
('dayaheadp90', '-50')
('forecast', '0')
('forecastp10', '0')
('forecastp90', '0')
('forecastupdated', '0')
('intradayp10', '-50')
.
.
.
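
As for why the minidom attempt in the question printed no values: in the DOM, nodeValue of an element node is None, and the text lives in a child text node. A minimal sketch of how the original approach could be adjusted is below. The XML is a heavily trimmed stand-in for the real file (the root and Item names are placeholders), and text_of() is a helper name I introduced for illustration.

```python
from xml.dom import minidom

# Heavily trimmed stand-in for the file in the question.
XML = (
    '<root><Item>'
    '<Forecast>0</Forecast>'
    '<StartsOn xmlns:a="http://schemas.datacontract.org/2004/07/System">'
    '<a:DateTime>2013-09-29T22:00:00Z</a:DateTime>'
    '</StartsOn>'
    '</Item></root>'
)

doc = minidom.parseString(XML)


def text_of(element):
    """Concatenate the direct text-node children of an element."""
    return "".join(
        node.data for node in element.childNodes
        if node.nodeType == node.TEXT_NODE
    )


# nodeValue on the element itself is None, which is what the question hit.
element_values = [e.nodeValue for e in doc.getElementsByTagName("Forecast")]

# Reading the child text nodes gives the actual content.
forecasts = [text_of(e) for e in doc.getElementsByTagName("Forecast")]

# getElementsByTagName matches the qualified name, so the prefix is included.
datetimes = [text_of(e) for e in doc.getElementsByTagName("a:DateTime")]

print(element_values, forecasts, datetimes)
```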
1 Comment

I think the main problem is solved. I will write the code to get the correct output. Thank you.