Shuffle file randomly with some additional constraints

I have a massive songs playlist and also, while some musicians have several cds, others have simply one track. I intended to sort the playlist so the very same musician will not play two times straight, or his tracks will not wind up primarily at first or end of the playlist.

Instance playlist:

$ cat /tmp/playlist.m3u
Anna A. - Song 1
Anna A. - Song 2
I--Rock - Song 1
John B. - Song 1
John B. - Song 2
John B. - Song 3
John B. - Song 4
John B. - Song 5
Kyle C. - Song 1
U--Rock - Song 1

Output from sort -R or shuf:

$ sort -R /tmp/playlist.m3u
Anna A. - Song 1 #
U--Rock - Song 1
Anna A. - Song 2 # Anna's songs are all in the beginning.
John B. - Song 2
I--Rock - Song 1
John B. - Song 1
Kyle C. - Song 1
John B. - Song 4 #
John B. - Song 3 #
John B. - Song 5 # Three of John's songs in a row.

What I am expecting:

$ some_command /tmp/playlist.m3u
John B. - Song 1
Anna A. - Song 1
John B. - Song 2
I--Rock - Song 1
John B. - Song 3
Kyle C. - Song 1
Anna A. - Song 2
John B. - Song 4
U--Rock - Song 1
John B. - Song 5
12
2022-07-25 20:44:28
Source Share
Answers: 1

Your example data and constraints actually only allow a few solutions—you must play John B. every other song, for example. I'm going to assume your actual full playlist isn't essentially John B, with random other stuff to break it up.

This is another random approach. Unlike @frostschutz's solution, it runs quickly. It does not guarantee a result that matches your criteria, however. I also present a second approach, which works on your example data—but I suspect will produce bad results on your real data. Having your real data (obfuscated), I add approach 3—which is a uniform random, except it avoids two songs by the same artist in a row. Note that it only makes 5 "draws" into the "deck" of remaining songs, if after that it still is faced with a duplicate artist, it'll output that song anyway—this way, its guaranteed that the program will actually finish.

Approach 1

Basically, it generates a playlist by at each point, asking "which artists do I still have unplayed songs from?" Then picking a random artist, and finally a random song from that artist. (That is, each artist is weighted equally, not in proportion to the number of songs.)

Give it a try on your actual playlist, and see if it produces better results than uniformly random.

Usage: ./script-file < input.m3u > output.m3u Make sure to chmod +x it, of course. Note it doesn't handle the signature line that is at the top of some M3U files properly... but your example didn't have that.

#!/usr/bin/perl
use warnings qw(all);
use strict;

use List::Util qw(shuffle);

# split the input playlist by artist
my %by_artist;
while (defined(my $line = <>)) {
    my $artist = ($line =~ /^(.+?) - /)
        ? $1
        : 'UNKNOWN';
    push @{$by_artist{$artist}}, $line;
}

# sort each artist's songs randomly
foreach my $l (values %by_artist) {
    @$l = shuffle @$l;
}

# pick a random artist, spit out their "last" (remeber: in random order)
# song, remove from the list. If empty, remove artist. Repeat until no
# artists left.
while (%by_artist) {
    my @a_avail = keys %by_artist;
    my $a = $a_avail[int rand @a_avail];
    my $songs = $by_artist{$a};
    print pop @$songs;
    @$songs or delete $by_artist{$a};
}

Approach 2

As a second approach, instead of pick a random artist, you can use pick the artist with the most songs, who is also not the last artist we picked. The final paragraph of the program then becomes:

# pick the artist with the most songs who isn't the last artist, spit
# out their "last" (remeber: in random order) song, remove from the
# list. If empty, remove artist. Repeat until no artists left.
my $last_a;
while (%by_artist) {
    my %counts = map { $_, scalar(@{$by_artist{$_}}) } keys %by_artist;
    my @sorted = sort { $counts{$b} <=> $counts{$a} } shuffle keys %by_artist;
    my $a = (1 == @sorted)
        ? $sorted[0]
        : (defined $last_a && $last_a eq $sorted[0])
            ? $sorted[1]
            : $sorted[0];
    $last_a = $a;
    my $songs = $by_artist{$a};
    print pop @$songs;
    @$songs or delete $by_artist{$a};
}

The rest of the program stays the same. Note that this by far isn't the most efficient way to do this, but it should be fast enough for playlists of any sane size. With your example data, all generated playlists will start with a John B. song, then an Anna A. song, then a John B. song. After that, it's much less predictable (as everyone but John B. has one song left). Note that this assumes Perl 5.7 or later.

Approach 3

Usage is the same as the previous 2. Note the 0..4 part, that's where the 5 tries max comes from. You could up the number of tries, e.g., 0..9 would give 10 total. (0..4 = 0, 1, 2, 3, 4, which you'll notice is actually 5 items).

#!/usr/bin/perl
use warnings qw(all);
use strict;

# read in playlist
my @songs = <>;

# Pick one randomly. Check if its the same artist as the previous song.
# If it is, try another random one. Try again 4 times (5 total). If its
# still the same, accept it anyway.
my $last_artist;
while (@songs) {
    my ($song_idx, $artist);
    for (0..4) {
        $song_idx = int rand @songs;
        $songs[$song_idx] =~ /^(.+?) - /;
        $artist = $1;
        last unless defined $last_artist;
        last unless defined $artist; # assume unknown are all different
        last if $last_artist ne $artist;
    }

    $last_artist = $artist;
    print splice(@songs, $song_idx, 1);
}
7
2022-07-25 21:53:02
Source