rutrum

joined 2 years ago
[–] rutrum@lm.paradisus.day 1 points 8 months ago

That's a great solution... I'm not interested in learning a new slicer right now!

[–] rutrum@lm.paradisus.day 1 points 8 months ago (1 children)

First off, understanding the different data structures at a high level is mandatory. I would learn what the differences between a dataframe, a series, and an index are. Further, learn how numpy's ndarrays play a role.

From there, unfortunately, I had to learn by doing...or rather struggling. It was one question at a time to Stack Overflow, like "how to filter on a column in pandas". Maybe in the modern era of LLMs, this part is easier. And eventually, I learned some patterns and internalized the data structures.

[–] rutrum@lm.paradisus.day 1 points 8 months ago

You are correct. Some data sources like parquet include metadata that helps with this, but it's not as robust as databases, I don't think. And of course, CSVs have no metadata (other than a header row, I guess.)

The actual specification for how to efficiently store tabular data in memory while still permitting quick filtering, pivoting, i.e. all the transformations you need, is called Apache Arrow. It is the backend of polars and is also a non-default backend of pandas. I'm unfamiliar with the complexity of the format.

[–] rutrum@lm.paradisus.day 1 points 8 months ago (3 children)

I learned SQL before pandas. It's still tabular data, but the mechanisms to mutate/modify/filter the data are different. It took a long time to get comfy with pandas. It wasn't until I understood that the ways you interact with a database table and a dataframe are very different that I finally started to get a grasp on pandas.

[–] rutrum@lm.paradisus.day 0 points 8 months ago

If it works, don't fix it!

[–] rutrum@lm.paradisus.day 6 points 8 months ago (2 children)

A big feature of polars is only loading applicable data from disk. But during exploratory data analysis (EDA) you often have the whole dataset in memory, so filters won't help much there. Polars has a good page in their docs about all the possible optimizations it is capable of. https://docs.pola.rs/user-guide/lazy/optimizations/

One I see off the top is projection pushdown, which only selects the columns relevant to the final transformation. In pandas, if you perform a group by with aggregation, then only look at a few columns, you still perform the aggregation across all the data. In polars' lazy API, you would define the entire process upfront, and it would know not to aggregate certain columns, for instance.

[–] rutrum@lm.paradisus.day 14 points 8 months ago* (last edited 8 months ago)

Imo Rust already has the perfect book. I would make a resource for C developers. Especially since you know C already.

[–] rutrum@lm.paradisus.day 5 points 8 months ago (4 children)

It's a paradigm shift from pandas. In polars, you define a pipeline, or a set of instructions, to perform on a dataframe, and only execute them all at once at the end of your transformation. In other words, it's lazy. Pandas is eager, meaning every part of the transformation happens sequentially and in isolation. Polars also has an eager API, but you likely want to use the lazy API in a production script.

Because it's lazy, Polars performs query optimization, like a database does with a SQL query. At the end of the day, if you're using polars for data engineering or in a pipeline, it'll likely run much faster and be more memory efficient. Polars also executes operations in parallel.

[–] rutrum@lm.paradisus.day 10 points 8 months ago (1 children)

How do you use Godot for data science?

[–] rutrum@lm.paradisus.day 3 points 8 months ago

logicservers is what I use. Heard about it from a Xisumavoid video. It's fine. But they use a super outdated version of a proprietary web UI to manage the Minecraft server specifically. It does work, though, and they provide the files over FTP for editing configs, uploading mods, making backups etc.

[–] rutrum@lm.paradisus.day 4 points 9 months ago* (last edited 9 months ago)

~/repo for code I write and ~/src for code I didn't.

[–] rutrum@lm.paradisus.day 7 points 9 months ago

I used Zola for a while, but at the end of the day there weren't enough themes available that fit what I was looking for. I ended up messing with the templating engine to get what I needed.

I suggest OP choose Hugo over Zola, in the hopes that they find a theme that suits them best and for the most part prevents them from having to touch templating to begin with.

 

You know, ZFS, ButterFS (btrfs...it's actually "better" right?), and I'm sure more.

I think I have ext4 on my home computer I installed Ubuntu on 5 years ago. How does the choice of file system play a role? Is that old hat now? Surely something like ext4 has its place.

I see a lot of talk around filesystems but I've never found a great resource that distinguishes them at a level that assumes I don't know much. Can anyone give some insight on how file systems work and why these new filesystems, that appear to be highlights and selling points in most distros, are better than older ones?

Edit: and since we are talking about filesystems, it might be nice to describe or mention how concepts like RAID or LUKS are related.

 

Came across a new nix wiki attempt. The announcement post is made on discourse with high skepticism.

But I really like it for two reasons:

  • For now, it's incredibly informal and the barrier to entry is low. And because I can make edits directly in the web interface, it felt easy to contribute.
  • The creator mentions wanting this to be like the Arch wiki. In other words, contain information useful to nix users, but not necessarily nix specifically.

I was able to contribute a new article about distrobox, a tool I discovered and made a post about here a month or so ago.

Maybe we don't "need" another wiki, but the opportunity to contribute really made this one stand out to me. In case you all might want to contribute or learn something, I thought I would share.

 

I'm conflicted on what should handle my login manager, desktop environment, and window manager. What are the pros and cons of doing it from a NixOS configuration versus a home manager configuration?

 

I made a post a while ago asking what you do when NixOS isn't cutting it. You need a package that isn't available as a flatpak/appimage or already in nixpkgs. You don't want to build from source, because it's either too difficult or too time consuming. One suggestion was containerization or virtual machines, but those seemed too cumbersome. Well, distrobox is the tool that fixes it.

Distrobox is a shell script that wraps docker/podman to run a container of a distribution of your choice. But it does it behind a very high level API, and integrates the container environment seamlessly with your host environment. It is seriously as easy as this, if you need to install something with apt inside debian:

$ distrobox create -n my_debian --image debian:latest
$ distrobox enter my_debian

And bang, you're in a debian container and it won't even feel like it. It automatically integrates your shell environment and maps your home directory inside the container (or something like that.) You seriously wouldn't know unless you neofetch. Best part is that since everything is in the nix store, every program in your environment should work, for the most part, inside this container. I've not noticed problems yet.

Tada! apt is available in this environment and you can install what you need. Then you can run it while inside the container. From the host machine, outside the container, you can run it directly too. Say you installed program X in debian:

$ distrobox enter my_debian -- X

And it will just run the command and send you back to the host machine.

In the case of docker, you can type docker ps and it will list your debian container my_debian.

There are two more things I want to do to really polish this workflow. The first is to change my shell prompt so I know that I'm actually in debian without typing neofetch! Inside the box the variable CONTAINER_ID is set and the hostname is modified. I've adjusted my starship prompt to look like this when inside the box:

distrobox:my_debian ~ $
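Without starship, the same idea might look something like this in plain bash. This is only a sketch: it assumes CONTAINER_ID is set inside the box as described above, and box_prompt_prefix is a name made up for the example.

```shell
# Hypothetical helper: prints "distrobox:<name> " inside a box, nothing on the host.
box_prompt_prefix() {
  if [ -n "${CONTAINER_ID:-}" ]; then
    printf 'distrobox:%s ' "$CONTAINER_ID"
  fi
}

# Bash prompt: optional box prefix, then the current directory, then "$".
PS1='$(box_prompt_prefix)\w \$ '
```

On the host the prefix is empty, so the prompt stays as it was.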

And lastly, I really want to blur the lines. If I install X in debian, I want to just call it directly from the host as X, not invoke my debian instance with distrobox enter.

When you type X and the program is missing, bash (and fish and zsh I'm sure) runs a hook that you can look at by typing

$ declare -p -f command_not_found_handle

By overriding this, you could have it try inside the container whenever it can't find the application on the host, like so:

command_not_found_handle () {
  # Quote "$@" so arguments containing spaces reach the container intact.
  distrobox enter my_debian -- "$@"
}

This is not a perfect solution, and I'm still experimenting with how to integrate this seamlessly without accidentally running things inside debian and not realizing it. If you have suggestions for how to improve calling commands from the outside environment, please share. Best case might just be adding aliases for programs explicitly. For example, alias X='distrobox enter my_debian -- X'.
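A middle ground between the catch-all hook and per-program aliases could be an allowlist inside the hook. This is only a sketch: BOX_NAME and BOX_COMMANDS are names invented for the example, and the only distrobox call it uses is the distrobox enter NAME -- CMD form from above.

```shell
# Hypothetical settings: the box to fall back to, and the commands allowed to fall through.
BOX_NAME="my_debian"
BOX_COMMANDS="X"

command_not_found_handle() {
  local cmd="$1"
  local allowed
  for allowed in $BOX_COMMANDS; do
    if [ "$cmd" = "$allowed" ]; then
      # Say where the command is actually running, so it is never a surprise.
      echo "running '$cmd' inside $BOX_NAME" >&2
      distrobox enter "$BOX_NAME" -- "$@"
      return
    fi
  done
  # Anything not on the list fails the way a missing command normally does.
  echo "$cmd: command not found" >&2
  return 127
}
```

Typos and commands you never installed in the box still fail on the host, and anything that does get forwarded announces itself on stderr first.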

Anyway, distrobox is the solution! This is one more barrier removed that was preventing me from moving my main computer over to NixOS. I'm so happy to have found this and wanted to share.

 

Dust is a rewrite of du (in Rust obviously) that visualizes your directory tree and what percentage each file takes up. But it only prints as many files as fit in your terminal height, so you see only the largest files. It's been a better experience than du, which isn't always easy to navigate to find big files (or at least I'm not good at it.)

Anyway, found a log file at .local/state/nvim/log that was 70 GB. I deleted it. Hope it doesn't bite me. Been pushing around 95% of disk space for a while so this was a huge win 👍
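For comparison, a rough du-only version of the same "biggest things first, cut to a fixed number of lines" view. The function name biggest and the default of 20 lines are made up for this sketch, and it has none of dust's tree drawing or percentages.

```shell
# Hypothetical helper: list the largest entries under a directory, biggest first.
biggest() {
  local dir="${1:-.}"
  local count="${2:-20}"
  # -a includes files as well as directories; -k reports sizes in KiB
  # so a plain numeric sort puts the largest entries at the top.
  du -ak "$dir" 2>/dev/null | sort -rn | head -n "$count"
}
```

Something like biggest ~/.local 15 would print the 15 largest files and directories under ~/.local, with sizes in KiB.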

 

I came across privacy.com, a service that generates virtual credit cards, like aliases for your real credit card that can be paused or discarded at any moment.

My own credit card company has this feature. But it requires a browser plugin that so obviously is there to track my spending habits, so I've not wanted to consider it. Privacy.com looks like a great alternative.

But is it even worth it? It may be a hassle, but I can also cancel my actual credit card at any moment and they will send me a new number immediately and a card a few days later. From a privacy perspective, how much can a company use my credit card credentials to track me? Maybe a third-party virtual card provider even masks my own purchases so not even my credit card company knows? Not sure about that one.

Please share if you use one, who it's with, and if it's worth it.

 

I've been spending a couple weeks unable to modify my system, because using my window manager was ungodly slow (like 1fps.) Luckily NixOS lets you pick a previous generation to load so I could make changes, build a new generation, and try again.

It took me too long to find, but I realized I had the X session managed by both NixOS and home manager. Removing the duplicate fixed the problem. I assume this had 2 X sessions open and they were competing for resources or something. Be cautious! :)

 

This is the 800ml server from Hario. I make 600g water / 30-35g coffee in it every morning.

I drink my coffee slowly, and really like it hot. When I made a single 300g cup of coffee, I'd time my consumption wrong and it would be lukewarm before I finished. I didn't necessarily mind this, but now that I've been using this server I get hot coffee on demand, very conveniently.

I downsized my regular mug for a teacup, so I always get just enough hot coffee to sip and enjoy before it loses too much temp. So now I drink a lot of small teacups' worth instead of a regular mug. I recommend you try this style of serving coffee and see if it's for you.

Bonus: this has been so helpful when making coffee for multiple people, since I don't always know when others wake up or come downstairs. Since it's a huge insulated server I never worry about not being able to serve my roommates hot coffee.

 

Every now and then I see a program that doesn't have a default.nix or flake.nix in the source, doesn't have an entry in nixpkgs, and that I otherwise can't find a derivation for. So I write them myself.

What's the best way to share these? Should I contribute to nixpkgs? (does this count if I'm making flakes?) Do I maintain a single repo for each program? Or do I create a repo with a collection of flakes? Something else?

 

Hi all, I've been getting into nix lately (I've been posting here frequently) and wanted to know what projects everyone is working on. Are you trying to integrate nix into an existing project? Contribute to nixpkgs? Experiment with your configs?

 

Short video from Vimjoyer on how to set up a Firefox install using home manager and flakes. In particular, the focus of this video was how to use an external flake as a source for Firefox extensions, since they aren't available in nixpkgs.

 

I came across this article when wondering how to integrate the "building" aspect of nix (that is, not just a devshell) with static websites or other projects that involve some output that is not an executable.

This article also talks about adding inputs from GitHub that aren't necessarily flakes. I've used this myself to pull some example configurations for certain programs that I haven't felt like tinkering with myself yet.
