
I have a string that I scraped from a university website. I want to parse it into a table in which each row consists of the text before and after a colon (":").

This is the string:

'課程中文名稱 Title of Course in Chinese:論文 課程英文名稱 Title of Course in English:Thesis (Projects) 應修系級 Major:法律學系博士班2 , 授課教師 Instructor:****** 選修類別 Required/Elective:必 全半學年 Whole or Half of the Academic Year:半學年 學 分 Credit(s):0 學分 時 數 Hour(s):0 小時 (function(window, $) { var sheetID = "1qkUIt6x8ry7F-etZJLMNKmEtDr0mwYdV3RNWw8fmOko", // 試算表代號 gid = "0", // 工作表代號 sql = "select%20B,%20C,%20D,%20E,%20F%20where%20G%20=%20'M6106'", // SQL 語法 callback = "callback"; // 回呼函數名稱 $.getScript("https://spreadsheets.google.com/tq?tqx=responseHandler:" + callback + "&tq=" + sql + "&key=" + sheetID + "&gid=" + gid); window[callback] = function(json) { var rowArray = json.table.rows, colArray = json.table.cols, rowLength = rowArray.length, colLength = colArray.length, html = "", i, j, dataGroup, dataLength, colName = new Array(); for (i = 0; i < colLength; i++) { colName[i] = colArray[i].label.replace(/彈性授課方式\W/g,''); } for (i = 0; i < rowLength; i++) { dataGroup = rowArray[i].c; dataLength = dataGroup.length; for (j = 0; j < dataLength; j++) { if (!dataGroup[j]) { continue; } if(dataGroup[j].v == "Y") html += colName[j] + ","; else if(j == (dataLength - 2) && dataGroup[j].v !== null) html += colName[j] + "-" + dataGroup[j].v + ","; } //if (dataGroup[dataLength - 2].v !== null) { //html += colName[dataLength - 2] + "-" + dataGroup[dataLength - 2].v + ","; //} html = html.substring(0,html.length - 1); html += "
"; } $("#test").html(html); if(html != "") $("#highlight").show(); }; })(window, jQuery); 「請遵守智慧財產權」及「不得非法複製及影印」。授課老師尚未建置課程大綱,若有需要請直接洽該任課教師!'

I tried to remove the JavaScript by following this Stack Overflow page.

An ad hoc approach I tried was to split the string and then pair up the resulting pieces two at a time. This is the code:

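# Split first (the colon is an assumed delimiter); slicing the raw string would pair up characters.
spl = "the string".split(":")
# Group the pieces two at a time.
spl = [spl[i:i + 2] for i in range(0, len(spl), 2)]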

I know that I could get a lot more data by executing the JavaScript in the browser DOM. My question is: how can I first strip out the JavaScript and then parse the remaining string into a table?
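As a rough illustration of the first step, the inline script could be cut out of the flattened string with a regular expression. This is only a sketch: it assumes the script always begins with `(function(window, $)` and ends with `})(window, jQuery);`, as it does in the string above. `strip_inline_script` and `sample` are made-up names, and `sample` is a shortened stand-in for the full string.

```python
import re

# Assumption: the inline script is a single jQuery IIFE with these exact
# opening and closing markers, as in the scraped string.
SCRIPT_RE = re.compile(
    r"\(function\(window, \$\).*?\}\)\(window, jQuery\);",
    re.DOTALL,  # the script body contains a line break
)


def strip_inline_script(text: str) -> str:
    """Remove the inline script and collapse the whitespace left behind."""
    without_script = SCRIPT_RE.sub(" ", text)
    return re.sub(r"\s+", " ", without_script).strip()


# Shortened stand-in for the scraped string (hypothetical sample data).
sample = (
    'Hour(s):0 小時 (function(window, $) { var gid = "0"; html += "\n"; })'
    "(window, jQuery); 「請遵守智慧財產權」"
)
cleaned = strip_inline_script(sample)
print(cleaned)
```

A regular expression is brittle here because it depends on the exact wrapper text; if the raw HTML is available, dropping the script elements before extracting text is likely more robust.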

  • Can you please edit your question and include the raw HTML, not just the string without tags? Commented Jul 22, 2021 at 7:27
  • I have the URL, is that okay? I don't have the HTML. Commented Jul 22, 2021 at 9:00
  • And what is the URL? Commented Jul 22, 2021 at 9:35
  • sea.cc.ntpu.edu.tw/pls/dev_stud/… this one. Sorry for the late reply. Commented Jul 23, 2021 at 6:01

1 Answer


Try:

import requests
from bs4 import BeautifulSoup

url = "https://sea.cc.ntpu.edu.tw/pls/dev_stud/course_query.queryGuide?g_serial=U1382&g_year=109&g_term=2&show_info=part"
soup = BeautifulSoup(requests.get(url).content, "html.parser")

for tr in soup.body.table.select("tr"):
    print(tr.get_text(strip=True))
    print("-" * 80)

Prints:

...
--------------------------------------------------------------------------------
課程中文名稱 Title of Course in Chinese:大學英文1B課程英文名稱 Title of Course in English:College English應修系級 Major:語文通識1  ,中國文學系1  ,歷史學系1  ,休閒運動管理學系1  ,法律學系財經法組1  ,法律學系法學組1  ,法律學系司法組1  ,授課教師 Instructor:殷雅玲選修類別 Required/Elective:必向度類別 Classification:全半學年 Whole or Half of the Academic Year:全學年學  分 Credit(s):2學分時  數 Hour(s):2小時
--------------------------------------------------------------------------------
彈性授課方式:
--------------------------------------------------------------------------------
教師網址 Instructor's Website :
--------------------------------------------------------------------------------
教師專長 Instructor's Specialty :英語教學
--------------------------------------------------------------------------------
課綱附檔 Attachments :
--------------------------------------------------------------------------------
先修科目 Prerequisites:High school English
--------------------------------------------------------------------------------

...and so on.
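To turn printed rows like these into label/value pairs, one possible sketch is below. It is not part of the code above: `LABELS`, `parse_pairs`, and `parse_row` are hypothetical names, the label list is copied by hand from the first printed row, and the sample row is abbreviated. It also assumes the separator may be either an ASCII or a full-width colon.

```python
import re

# Hypothetical label list, copied from the first printed row; extend as needed.
LABELS = [
    "課程中文名稱 Title of Course in Chinese",
    "課程英文名稱 Title of Course in English",
    "應修系級 Major",
    "授課教師 Instructor",
    "選修類別 Required/Elective",
    "向度類別 Classification",
    "全半學年 Whole or Half of the Academic Year",
    "學 分 Credit(s)",
    "時 數 Hour(s)",
]
# A known label followed by an ASCII or full-width colon.
LABEL_RE = re.compile(
    "(" + "|".join(re.escape(label) for label in LABELS) + r")\s*[::]"
)


def parse_pairs(text):
    """Split a row holding several 'label:value' pairs, using the known labels."""
    text = re.sub(r"\s+", " ", text).strip()  # "學  分" becomes "學 分"
    matches = list(LABEL_RE.finditer(text))
    rows = []
    for current, following in zip(matches, matches[1:] + [None]):
        end = following.start() if following else len(text)
        # Each value runs up to the start of the next label.
        rows.append((current.group(1), text[current.end():end].strip(" ,")))
    return rows


def parse_row(text):
    """Return (label, value) pairs; fall back to splitting on the first colon."""
    pairs = parse_pairs(text)
    if pairs:
        return pairs
    parts = re.split(r"[::]", text, maxsplit=1)
    value = parts[1].strip() if len(parts) > 1 else ""
    return [(parts[0].strip(), value)]


# Abbreviated copy of the first printed row (sample data for illustration).
first_row = (
    "課程中文名稱 Title of Course in Chinese:大學英文1B"
    "課程英文名稱 Title of Course in English:College English"
    "應修系級 Major:語文通識1  ,中國文學系1  ,"
    "授課教師 Instructor:殷雅玲選修類別 Required/Elective:必"
    "向度類別 Classification:全半學年 Whole or Half of the Academic Year:全學年"
    "學  分 Credit(s):2學分時  數 Hour(s):2小時"
)
rows = parse_row(first_row)
for label, value in rows:
    print(f"{label}\t{value}")
```

Matching against known labels avoids guessing where a value ends and the next label begins, which the concatenated text alone does not reveal. Rows with a single pair, such as the Prerequisites row, go through the first-colon fallback.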