• 0 Posts
  • 38 Comments
Joined 2 years ago
Cake day: March 23rd, 2024

  • I’ve written a Python/Qt script that keeps lists like those you mention. It’s sort of a poor man’s database manager, but its only database is a collection of CSV files.

    • You can use an address list to keep track of contacts’ phone numbers, mailing addresses, and eMail addresses.

    • You can use a calendar to remind you about events and appointments including date, time, and duration. You can add notes about finding the location and other prerequisites to attendance.

    • You can keep separate passwords in a password list for every website you visit and every piece of gear you own.

    • You can keep links to favorite websites in a bookmark list.

    When I try to include a direct link to my Python script, which does that, my responses and in fact the whole posted discussion are taken down. … something to do with self-promotion of untested software, I suppose. But you can find it in the Cheese Shop (see Wikipedia, “Python Package Index”) under tonto2.


  • drops on a lotus leaf

    Here’s a strategy for scoring your own search results.

    “Keywords” are the seven words most commonly occurring on the page. If these seven words are seen to be repeated on the page to an unusual degree, then it is a good assumption that the page was designed by the author to appear high on search results.

    Keyword density is a measure of “gloss.” Most people will read pages with high keyword density as unusually glossy. Keyword density is not necessarily related to how genuine the page content appears to be otherwise, but most people will look askance at a page that is too glossy.
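    As a rough illustration of the measure described above, here is a minimal sketch in Python. It assumes that "keyword density" means the share of all words on the page taken up by the seven most frequent words, and it does no stop-word filtering; both are assumptions made for this example, and the function name `keyword_density` is hypothetical rather than anything taken from a published package.

```python
import re
from collections import Counter


def keyword_density(text, top_n=7):
    """Return (keywords, density) for the top_n most frequent words.

    Assumption: density is the fraction of all words on the page that
    are occurrences of the top_n most common words.
    """
    # Crude tokenizer: lowercase runs of ASCII letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return [], 0.0
    top = Counter(words).most_common(top_n)
    keywords = [word for word, _ in top]
    density = sum(count for _, count in top) / len(words)
    return keywords, density
```

    You would feed it the visible text of a page (with the markup already stripped) and compare the densities of the pages in one set of results against each other; the number means little on its own.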

    It should come as no big surprise that the pages that appear high on search results have been designed that way. They are deliberately glossy with high keyword density. You may consider whether to skip reading them or even loading them in your browser. Chances are good that the glossy pages are mostly advertising.

    Generally you will find interspersed in your results a handful of sites with low keyword density. These are likely from universities, government sites, and research institutions that have sources of revenue beyond advertising. You may consider whether to load these up and skim through them. Probably they will show a publication date, author, and list of references, which will move your research forward.

    AI-generated sites often exhibit high keyword density. This is probably deliberate so that they garner advertising revenue. However, it may also be due to “bot 'splaining,” which is polly-paraphrasing a series of several (perhaps contradictory) articles.

    Keyword density is not the only measure of gloss. There are others that have been developed to measure ratios between parts of speech. Unfortunately NONE OF THESE — including keyword density — distinguish sharply between pages that naturally convey genuine information and pages that have been designed to convey fluff for ulterior purposes. It is unlikely that combining measures of gloss will result in a tool that discriminates much better than keyword density by itself.

    • Piskorski, Jakub, Marcin Sydow, and Dawid Weiss. “Exploring Linguistic Features for Web Spam Detection: A Preliminary Study.” AIRWeb '08: Proceedings of the 4th International Workshop on Adversarial Information Retrieval on the Web. Ed. Carlos Castillo, Kumar Chellapilla, and Dennis Fetterly. New York: ACM, Apr. 2008. 25-28. ISBN:9781605581590. DOI:10.1145/1451983. 09 Nov. 2025 https://users.pja.edu.pl/~msyd/lingFeat08draft.pdf.

    Nevertheless, you may wish to explore keyword density as a means to rank search results.

    When I try to include a direct link to my Python scripts, which do that, my responses and in fact the whole posted discussion are taken down. … something to do with self-promotion of untested software, I suppose. But you can find them in the Cheese Shop (see Wikipedia, “Python Package Index”) under clanker_score.

    We don’t want to make this too easy for just anyone to censor all his search results. Rather, these scripts are meant as a learning tool. They demonstrate generally how rotten search results can be on one particular and not very compelling dimension. It should not be necessary to download and scan each and every page. You should be able to train yourself to ignore a priori results that include handfuls of pages from unauthoritative sites.


  • Boy howdy, do I have just the script for you!

    https://pypi.org/project/clanker_score/

    Full disclosure: It doesn’t work. But the idea is nice: … that you could — perhaps in real life — identify AI-generated content. … so I wrote a framework that purports to do that.

    Keyword density is not the only measure of gloss. There are others that have been developed to measure ratios between parts of speech. Unfortunately none of these distinguish sharply between pages that naturally convey genuine information and pages that have been designed to convey fluff for ulterior purposes. It is unlikely that combining measures of gloss will result in a tool that discriminates much better than keyword density by itself.

    • Piskorski, Jakub, Marcin Sydow, and Dawid Weiss. “Exploring Linguistic Features for Web Spam Detection: A Preliminary Study.” AIRWeb '08: Proceedings of the 4th International Workshop on Adversarial Information Retrieval on the Web. Ed. Carlos Castillo, Kumar Chellapilla, and Dennis Fetterly. New York: ACM, Apr. 2008. 25-28. ISBN:9781605581590. DOI:10.1145/1451983. 09 Nov. 2025 https://users.pja.edu.pl/~msyd/lingFeat08draft.pdf.





  • I recommend my python script, Tonto2.

    What does Tonto2 do?

    It keeps lists.

    You can use lists to keep in touch with family, friends, and cow-orkers.

    Tonto2 keeps four kinds of lists:

    • You can use an address list to keep track of contacts’ phone numbers, mailing addresses, and eMail addresses.

    • You can use a calendar to remind you about events and appointments including date, time, and duration. You can add notes about finding the location and other prerequisites to attendance.

    • You can keep separate passwords in a password list for every website you visit and every piece of gear you own.

    • You can keep links to favorite websites in a bookmark list.

    Additionally you can make a list of bibliographic entries for writing research papers and for saving well-formatted footnotes for Web sites, but this is an arcane topic that will probably not be of general interest.

    The information in these lists is at your fingertips.

    You own it, and you can keep it. You can share it piecemeal with other people and computers without having to trust anyone or any thing with the whole enchilada. This is the idea of Tonto2.


  • Exactly! I harbor nostalgia for the old Windows 3 desktop icon grid, so I open a file manager window pointing to ~/Desktop and display the *.desktop shortcuts there as icons. This is done automatically when gdm starts. My file manager is PCManFM, which is a rip-off of nautilus. Double-clicking on an icon opens the shortcut — be it to a terminal or a graphical application. I have to alt-tab to the PCManFM window of course, so I need the keyboard. Then I have to double-click with the mouse. It’s keeping both hemispheres of the brain active: subject/verb, left/right. Presumably you can map your game controller’s buttons to keyboard equivalents like <right cursor>, <tab>, and <enter> (or map your game controller’s buttons to PCManFM’s hot key config), which would allow you to navigate the PCManFM icon grid.



  • I hit the super-key, type terminal, hit enter

    I harbor nostalgia for the old Windows 3 desktop icon grid, so I open a file manager window pointing to ~/Desktop and display the *.desktop shortcuts there as icons. This is done automatically when gdm starts. My file manager is PCManFM, which is a rip-off of nautilus. Double-clicking on an icon opens the shortcut — be it to a terminal or a graphical application. I have to alt-tab to the PCManFM window of course, so I need the keyboard. Then I have to double-click with the mouse. It’s keeping both hemispheres of the brain active: subject/verb, left/right.

    then I have a terminal which does not start maximized on workspace 1

    I run devilspie in the background to catch windows of certain applications such as terminal and maximize them on the fly. For this reason, I must disable wayland.

    Does the vanilla Gnome workflow expect you to use mouse and keyboard?

    Yes, both, apparently.

    It just seems like a lot of work/clicks/keys to achieve something simple.

    Well, that’s what you get for downplaying the role of icon grids.


  • The ideological issue (which you probably don’t care about) is that it pretty much requires proprietary (non-FOSS) drivers which run in kernel space and so in theory have complete access to all data on your computer (but then so does Intel ME). This is the main reason I personally will never use NVidia cards.

    The only meltdown I’ve had with Linux occurred on a minor rev-level update to Debian that plugged some hole in the kernel the NVidia proprietary driver was crawling through. I had used Debian and an NVidia proprietary driver for years on an ancient motherboard. Then suddenly that “solution” disappeared. I had to replace the whole machine. Yeah, it was time. No, I wasn’t ready. I don’t know whether I should have been more pissed at Debian or NVidia, but I’m still on Debian. After the kernel update, X11 reverted to a default driver, and no install, uninstall, reinstall combination of the proprietary drivers seemed efficacious. I’m sorry I don’t remember the exact software rev-levels and drivers involved. All notes I took at the time, if any, were lost in the subsequent crash and recovery from incompetently trying to roll back the kernel update.


  • This is probably NOT what you had in mind. What I use for launching apps under Gnome 43.9 is a traditional file manager. Historically, nautilus was Gnome’s file manager. I note that Gnome still has a file manager, but they don’t call it that. Over time nautilus has been gutted of a lot of its functionality. Thus, I have switched to PCManFM, which is a lightweight lookalike. I autostart it in my Desktop folder, which holds a handful of *.desktop shortcut files. I like the look of the “Icon” view mode because it reminds me of the old Windows 3.1 desktop. Alas, there is no grouping like what you’re hoping for (so far as I know), but you could create shortcuts to other *.desktop folders. PCManFM displays a tabbed window, and you can drag and drop icons onto folders on a window, and between tabs. I launch apps by double-clicking icons.
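    If you want to see what such a folder of shortcuts contains without opening a file manager, here is a small sketch that reads the *.desktop files with Python's standard `configparser`. The function name `list_shortcuts` is made up for this example, and it only looks at the `Name` and `Exec` keys of the `[Desktop Entry]` section; it is not how PCManFM itself reads the files.

```python
import configparser
from pathlib import Path


def list_shortcuts(folder):
    """Return (name, exec) pairs for the *.desktop files in folder."""
    shortcuts = []
    for path in sorted(Path(folder).glob("*.desktop")):
        # interpolation=None keeps field codes such as %U from being
        # treated as configparser placeholders.
        parser = configparser.ConfigParser(interpolation=None, strict=False)
        parser.optionxform = str  # preserve key case (Name, Exec)
        try:
            parser.read(path, encoding="utf-8")
        except configparser.Error:
            continue  # skip files that are not valid key files
        if not parser.has_section("Desktop Entry"):
            continue
        entry = parser["Desktop Entry"]
        shortcuts.append((entry.get("Name", path.stem), entry.get("Exec", "")))
    # Sort by display name, ignoring case, like an icon grid would.
    return sorted(shortcuts, key=lambda pair: pair[0].lower())
```

    Calling `list_shortcuts(Path.home() / "Desktop")` would list the launchers in the Desktop folder described above.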



  • I’m not aware of any service that [goes fully peer-to-peer] while being practical for most people, yet.

    Retroshare is almost ready for prime time after remaining in development for over 20 years. Each “friend” runs its own service for the decentralized network of “friends” and hands off message fragments from immediate “friends” for swapping files, store-and-forward messages, chats, etc., to other more distant network participants.

    The swindle is that your friends know you by your IP address. If Big Government, Big Media, or Big Crime knocks over one of them, they’ve got you, too. But — not to worry — you can actually — so I’m told — run an RS instance behind a TOR hidden service.

    I much prefer the article from 22 Mar 2019 about “TOR Onion Services” preserved at the Wayback Machine instead of the current article.



  • You’re required to provide full personal details to be hired to an employer with dubious security.

    I don’t know, but I’ve been told…

    You MAY THINK you’re submitting an application directly to an employer’s Personnel Office on that employer’s Web site, but you’re actually submitting your application to that employer’s contracted head hunter — hence the junk mail because that head hunter has other clients to recruit for. It’s the lack of transparency that gripes me.

    … so the head hunter has to use restrictive filters on applications they relay to all their clients because they can’t rely on the applicant to vet employers they’d be interested in beforehand. These restrictive filters reject applicants for silly reasons like not having experience with every single piece of software on an arbitrary list of brand names.

    There is no sunset date to an application made through a third party. The head hunter and his clients will continue to bug you in perpetuity.

    They will continue to bug you about nonexistent openings. Just as they can sometimes find positions for people who are not actually looking for employment, they can sometimes place people with employers who have no open positions. It seems worth their while to try. After all, you MAY STILL BE in the market … sort of.

    Employers and their head hunters continue to recruit for positions that have already been filled. This is the old “open requisition” problem. They aim to cover the risk that their new hires won’t pan out.

    The more positions you apply for, the more head hunter databases you appear in. All their job-application software is incompatible, so you have to reapply and reapply and reapply, but it all seeks the same information: Are you currently employed? If not, they don’t know you.




  • Alternatively you can use a spreadsheet and generate lists there.

    OK, I’m going to wade in here. It occurs to me that the OP could make use of my Tonto2 Python3 script for Linux and Windows. It puts a spreadsheet-like user interface over a *.csv file or files. You just need to make a home for the tag file(s). You can make bookmark lists that way and open the embedded http:// links in your browser. You could use file:/// links for local images. You could add as many columns as you want for all kinds of tags and sort and search the values to your heart’s content.
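    To give a feel for the general idea of a list kept in a plain *.csv file, here is a minimal sketch using only Python's standard `csv` module. It is not Tonto2's code, and the column names (`title`, `url`, `tags`) and function names are hypothetical; Tonto2's actual file layout may differ.

```python
import csv


def load_rows(path):
    """Read a CSV file with a header row into a list of dicts."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def search(rows, column, needle):
    """Return rows whose value in column contains needle (case-insensitive)."""
    needle = needle.lower()
    return [row for row in rows if needle in (row.get(column) or "").lower()]


def sort_by(rows, column):
    """Return rows sorted by the value in column, ignoring case."""
    return sorted(rows, key=lambda row: (row.get(column) or "").lower())
```

    With a bookmark file that has a `url` column, you could pass a matching row's link to `webbrowser.open()` from the standard library to open it in your browser, and a `file:///` link would work the same way for a local image.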