Fix a "parameter vs return value" typo, per @wizzwizz4
Source Link
J_H
  • 43.3k
  • 3
  • 38
  • 158

Rather than offer some "str or None" narrative text for humans to read, prefer to put -> str | None: in the signature. Then everyone can read it, including mypy and pyrefly.
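
To make that concrete, here is a minimal sketch. The function name `normalize_url` and its validation rule are hypothetical, invented only for illustration; the point is that the "might be None" contract lives in the signature, where a type checker can enforce it.

```python
from __future__ import annotations  # lets "str | None" parse on older interpreters

from urllib.parse import urlparse


def normalize_url(url: str) -> str | None:
    """Return the stripped URL, or None when it is not an http(s) URL.

    Hypothetical helper: the name and the rule are assumptions for this example.
    """
    parts = urlparse(url.strip())
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return None  # callers are forced by the checker to handle this case
    return parts.geturl()
```

A caller that forgets the None case, say by writing `normalize_url(u).lower()`, now gets flagged by the type checker instead of failing at runtime.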

added 129 characters in body

I get the sense that you wished to avoid depending upon lazynlp. But I wish that you had. More on that below. Then we would have an up-to-date TLDextract dep, plus justext, which I think is the big thing you wanted to evict from the deps and which seems a nice enough library to me. BTW, though it's not published on PyPI, you can still use a GitHub repo URL to depend on lazynlp. You can even bake in a particular immutable commit hash.

After you (quickly) ship version 0.1.0, I urge you to consider using uv to manage dependencies listed in pyproject.toml, such as httpx. Consider adding a make install Makefile, or a shell script, that shows how to pull in deps and assemble a small text corpus.

deleted 6 characters in body

I imagine those work properly? But it seems like you're working too hard. Why didn't argparse deal with optional items for you already? When I import typer I always get appropriate CLI diagnostics displayed automatically, without jumping through such hoops.

This project should focus on its core value-add, which is managing large text datasets. To the extent that you can outsource any of the network minutiae to some well-tested library that has already worked out the details, I encourage you to do so.
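
As a rough sketch of what I mean by letting argparse do the work: the argument names below (`url_file`, `--out-dir`, `--timeout`) are my own placeholders, not taken from your code. Defaults cover the optional items, and a missing required argument produces a usage message and exit status 2 without any hand-written checks.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # All argument names here are hypothetical placeholders.
    parser = argparse.ArgumentParser(description="Download pages listed in a URL file.")
    parser.add_argument("url_file", help="text file with one URL per line")
    parser.add_argument("--out-dir", default="corpus", help="directory for downloaded pages")
    parser.add_argument("--timeout", type=float, default=10.0, help="seconds per request")
    return parser


# Optional items fall back to their defaults when omitted;
# type=float converts and validates the value for us.
args = build_parser().parse_args(["urls.txt", "--timeout", "2.5"])
```

Passing a non-numeric `--timeout`, or leaving out `url_file`, makes argparse print a diagnostic to stderr and exit, so the script body never sees bad input.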
