Hi, I like to learn about what resources are out there on the internet. I hope you have found my posts useful!
- 18 Posts
- 54 Comments
cll7793@lemmy.worldOPto
Ask Lemmy@lemmy.world•What are some tricks to efficiently search for information on the internet?English
1·8 months agoI’d highly recommend using ZIM to download the websites you want! (https://wiki.openzim.org/wiki/Build_your_ZIM_file)
Once downloaded, you honestly can probably get better results from basic notepad search than google/duckduckgo/bing.
cll7793@lemmy.worldOPto
Ask Lemmy@lemmy.world•What are some tricks to efficiently search for information on the internet?English
2·8 months agoSuper useful plugin! You can also subscribe to lists that block SEO/AI generated websites. Now if only there were a whitelist plugin that ranked forums higher.
cll7793@lemmy.worldOPto
Ask Lemmy@lemmy.world•What are some tricks to efficiently search for information on the internet?English
1·8 months agoSomeday this will be possible when an open source search engine comes around.
cll7793@lemmy.worldOPto
Ask Lemmy@lemmy.world•What are some tricks to efficiently search for information on the internet?English
2·8 months agoI noticed some of the best resources from the past are unfindable from any search engine. For example, some science YouTube channels that offer amazing quality content seem to be unfindable. They are replaced with other channels that try to clickbait their way to the top. The same can be said of websites that SEO as much as they can. The highest quality resources are also often the fewest in number. Quantity over quality is favored and amplified, and high quality resources are sometimes even censored. (Anna’s archive)
cll7793@lemmy.worldOPto
Ask Lemmy@lemmy.world•What are some tricks to efficiently search for information on the internet?English
2·8 months agoIt’s quite sad that we are now at a point where we are forced to make our own search engines from scratch. Search engines are hard! Google’s original search algorithm (about 2 decades ago) was quite amazing. You were able to give vague search terms and still find the answer you wanted. The secret sauce was ranking based on relevance to the search query. I’m not aware of any guides/projects on search engines. I wish there was a good way I could search for this. (The irony!) But a great starting resource is this series on networks from Wikipedia. (https://en.wikipedia.org/wiki/Network_theory)
Some random tips:
- The main goal of any search engine should be to minimize the number of times a user returns to the ranking page to click on a new link. Big tech should be doing this anyway, but they have other goals.
- The main metadata database needs to topologically connect you to any part of the internet. (https://en.wikipedia.org/wiki/Graph_theory) Think of it as a hub/portal that gives you general directions but doesn’t tell you exactly where you should be heading. The ideal solution is to download everything from the internet and query each result for relevance to a search query individually, but this is intractable. Instead you have to group the internet into graphs and subgraphs - STEM, Social, Forums, E-commerce, etc. Hyperlinks offer an objective way to calculate connections between websites. For example Lemmy.world <-> Wikipedia.org. The weight of these connections gives you a way to guide a traversal algorithm during search. Semantic analysis of some form lets you draw better connections, making your search more efficient.
- The most powerful way to find connection/relevance to a search term is with transformers and their attention mechanism. For example, if the search query is “Open source search engine”, the attention heatmap would be on groups of website subjects like Forums, Q&A, Programming, Network Science, etc. There would also be a negative heatmap for topics like Cooking, Sports, Entertainment, etc. From there you want to recursively load metadata for websites. For example, for Lemmy it would be the title of all posts (and maybe their top comments). If it fits, load as much of this as you can into a transformer and calculate the heatmap relative to the search query. Again, you are not using the transformer to generate answers. This is a bad idea. Instead you are using it to rank search results in terms of relevance/attention, which is what the transformer is fundamentally designed for.
As a side note, you are able to tune your model to your own search preferences with little data. You are also able to exchange computation time for search quality! This is amazing. If computation is a concern, traditional traversal algorithms and basic relevance/ranking algorithms work too but at the cost of more engineering.
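To make the "traditional traversal" option a bit more concrete, here is a minimal sketch of a best-first walk over a weighted link graph. Everything in it is an assumption for illustration: the `LINKS` table, the site names, the weights, and the choice to multiply edge weights along a path are all made up, not taken from any real search engine.

```python
import heapq

# Hypothetical link graph: site -> {linked site: connection weight in (0, 1]}.
# The sites and weights are invented for illustration only.
LINKS = {
    "hub": {"stem": 0.9, "forums": 0.8, "ecommerce": 0.2},
    "stem": {"wikipedia.org": 0.9},
    "forums": {"lemmy.world": 0.9, "wikipedia.org": 0.5},
    "ecommerce": {"shop.example": 0.9},
    "lemmy.world": {"wikipedia.org": 0.6},
}

def best_first(start, budget):
    """Visit up to `budget` nodes, strongest cumulative connection first."""
    # heapq is a min-heap, so the path strength is stored negated.
    frontier = [(-1.0, start)]
    seen = set()
    order = []
    while frontier and len(order) < budget:
        neg_strength, node = heapq.heappop(frontier)
        if node in seen:
            continue
        seen.add(node)
        order.append((node, -neg_strength))
        for neighbor, weight in LINKS.get(node, {}).items():
            if neighbor not in seen:
                # Assumed scoring: product of edge weights along the path.
                heapq.heappush(frontier, (neg_strength * weight, neighbor))
    return order

visited = best_first("hub", budget=4)
for site, strength in visited:
    print(f"{site}: {strength:.2f}")
```

The budget is where you trade computation time for search quality: a larger budget explores weaker connections, a smaller one sticks to the strongest paths from the hub.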
I hope this sorta helps, if you have any other questions feel free to ask! The future of search will likely be self-hosted, as conflicts of interest within current search engine providers degrade the quality to the point where they are unusable.
cll7793@lemmy.worldOPto
Ask Lemmy@lemmy.world•What are some tricks to efficiently search for information on the internet?English
1·8 months agoFinding the balance of what to keep in the index is hard! The attention mechanism in transformers should be pretty good at ranking results. The idea is to feed titles, top answers, etc. into the context in bulk along with a search query. The attention heatmap relative to the search gives you a general rank for how good each result is. Ironically enough, this is probably the most powerful indexer, yet no big tech uses it; instead they have the model generate answers rather than rank them. The best part is, this system is tunable and can be adjusted to user preference with little data. The overall goal should be to minimize the number of results a user checks. (This should be what other engines are doing in the first place)
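Here is a toy sketch of the ranking idea, not a real transformer. It assumes bag-of-words vectors in place of learned embeddings and scores each title with a single scaled dot-product attention step, softmax(q·k/√d). The function names `embed` and `attention_rank` and the sample titles are hypothetical.

```python
import math
from collections import Counter

def embed(text, vocab):
    """Toy bag-of-words vector; a real system would use learned embeddings."""
    counts = Counter(text.lower().split())
    return [float(counts[word]) for word in vocab]

def attention_rank(query, titles):
    """Rank titles by softmax(q . k / sqrt(d)) against the query."""
    vocab = sorted({w for t in [query, *titles] for w in t.lower().split()})
    q = embed(query, vocab)
    scale = math.sqrt(len(vocab))
    scores = [
        sum(a * b for a, b in zip(q, embed(t, vocab))) / scale
        for t in titles
    ]
    # Softmax turns the raw scores into weights that sum to 1.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return sorted(zip(titles, weights), key=lambda pair: pair[1], reverse=True)

# Made-up titles standing in for post metadata.
titles = [
    "Best pasta recipes for beginners",
    "Building an open source search engine",
    "Search engine ranking with graphs",
]
ranked = attention_rank("open source search engine", titles)
for title, weight in ranked:
    print(f"{weight:.3f}  {title}")
```

The model only produces weights over existing results here; nothing is generated. Swapping `embed` for real embeddings (or reading attention weights out of an actual model) would be the next step, and is left as an assumption.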
cll7793@lemmy.worldOPto
No Stupid Questions@lemmy.world•What are some powerful open source projects everyone should know?English
2·2 years agoGlad you found it awesome! :)
cll7793@lemmy.worldOPto
Ask Lemmy@lemmy.world•What are some powerful open source projects everyone should know?English
1·2 years agoThank you! I didn’t know about them!!
cll7793@lemmy.worldOPto
Ask Lemmy@lemmy.world•What are some powerful open source projects everyone should know?English
0·2 years agoI know right? Open source hardware has so many potential benefits over commercial. Significantly decreased price, privacy, good documentation, right to repair, no conflict of interest and potentially, one day, performance. Imagine we have engineers from across the world improving a single computer chip design, generator design, solar panel fabrication process, or perhaps even an open source fusion reactor blueprint someday in the next 20 years (pun intended).
I’m seriously considering starting something like this myself. Open source blueprints for power generation/energy storage (regular batteries, thermal sand reservoir based batteries, hydro power generation), water filtration, machine tools for fabricating anything, CNC machines, plasma cutters, hand tools, etc. Basically everything you could need to live Open Source.
The problem as always is getting enough designers, engineers, and volunteers.
cll7793@lemmy.worldOPto
No Stupid Questions@lemmy.world•What are some powerful open source projects everyone should know?English
31·2 years agoThanks! I wonder if there are any websites hosting open source 3d models. Also thought I’d drop this resource too!
FreeCAD (Open source CAD modeler): https://wikipedia.org/wiki/FreeCAD
cll7793@lemmy.worldOPto
No Stupid Questions@lemmy.world•What are some powerful open source projects everyone should know?English
41·2 years agoSorry didn’t mean to cause any trouble. I collect and share internet resources with others. If you want to verify this for yourself, my post history has questions similar to this one. I removed the image to make this post more ‘generic’. I am genuinely trying to share resources. My apologies if it came across as advertising.
cll7793@lemmy.worldOPto
No Stupid Questions@lemmy.world•What are some powerful open source projects everyone should know?English
73·2 years agoNo it’s not. This was one example of a powerful resource. Another is Libgen, and the List of Awesome
https://github.com/topics/awesome https://en.wikipedia.org/wiki/Library_Genesis
cll7793@lemmy.worldOPto
No Stupid Questions@lemmy.world•What are your favorite mindblowing mathematics videos?English
7·2 years agoVsauce is awesome! He makes you question everything after each video!
Also…

cll7793@lemmy.worldOPto
No Stupid Questions@lemmy.world•What are your favorite mindblowing mathematics videos?English
4·2 years agoSome of my favorites are from Edward Frenkel and the Langlands Program. In analogy the Langlands Program can be thought of as the “Theory of Everything” of mathematics linking various seemingly disconnected fields together.
Another great video is from 3b1b where he shows Pi somehow emerge from 2 blocks colliding against each other.
Links & Resources
cll7793@lemmy.worldOPto
No Stupid Questions@lemmy.world•What are the highest quality search engines?English
1·2 years agoPlease solve the partial differential equation to continue watching the video:

“‘I am not a robot’ captcha is getting too hard…”: https://www.youtube.com/watch?v=ru6fi4O4lp4
cll7793@lemmy.worldOPto
No Stupid Questions@lemmy.world•What are the highest quality search engines?English
3·2 years agoYour comment belongs higher. When given the opportunity to make money through social media advertising, sometimes in the thousands or millions, companies, shareholders, and groups with a conflict of interest take it. Cablemod’s burning adapters, cryptocurrency scams, and paid positive YouTube reviews are some great examples. In general there is no honor system and it’s best to assume anything that can be abused will be.
Also, manipulating votes is incredibly effective at swaying public opinion due to the bandwagon effect. Spend a day’s worth of effort making fake accounts and downvoting any opinion you see as undesirable and most people will follow suit. This is especially bad in echo chambers like Twitter, Reddit, etc.
I wish a broader audience could be aware of this. The best I can do is try to spread the word.
Sources:
- The Problem with Linus Tech Tips: Accuracy, Ethics, & Responsibility: https://www.youtube.com/watch?v=FGW3TPytTjc
- Cablemod Recall: https://old.reddit.com/r/cablemod/comments/18o7bnv/planned_voluntary_safety_recall_of_cablemod/
- FTX Scandal: https://en.wikipedia.org/wiki/Bankruptcy_of_FTX
- Bandwagon Effect: https://en.wikipedia.org/wiki/Bandwagon_effect
- Echo Chamber: https://en.wikipedia.org/wiki/Echo_chamber_(media)
cll7793@lemmy.worldOPto
Asklemmy@lemmy.ml•What are the highest quality search engines?English
4·2 years agoSame! And Lemmy has provided the highest quality answers on the internet in my opinion.
cll7793@lemmy.worldOPto
No Stupid Questions@lemmy.world•What are the highest quality search engines?English
2·2 years agoSelf hosting is smart! Good things usually come to an end, at least if they are not open source.
cll7793@lemmy.worldOPto
Asklemmy@lemmy.ml•What are the highest quality search engines?English
21·2 years agoGiven how important search is, it is not a stable solution to entrust the technology, your data privacy, and fair pricing to a corporation. Kagi so far seems great, don’t get me wrong! But enshittification from monetary incentives almost always occurs. Open source search is the only stable long term solution.



Thanks! That’s a good idea!