How to split a Python list of strings based on a condition?

In this short tutorial we will learn how to divide a Python list into multiple sublists.

Example list object

All examples in this tutorial are based on the following list of programming languages:

all_lang_lst = ['R', 'Julia', 'Python', 'Haskell', 'Java', 'Javascript']

Join a list with a delimiter

In this simple case, the list elements are joined into a single string, separated by a comma followed by a space:

', '.join(all_lang_lst)

The output will be:

'R, Julia, Python, Haskell, Java, Javascript'
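
The reverse operation is the str.split() method, which cuts a delimited string back into a list. Here is a minimal sketch, assuming the same ', ' delimiter used above (the names joined_str and restored_lst are illustrative only):

```python
all_lang_lst = ['R', 'Julia', 'Python', 'Haskell', 'Java', 'Javascript']

# join the list into one comma-delimited string
joined_str = ', '.join(all_lang_lst)

# split the string back into a list using the same delimiter
restored_lst = joined_str.split(', ')

# True when the round trip gives back the original list
print(restored_lst == all_lang_lst)
```

The round trip only works cleanly when no element contains the delimiter itself.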

Split a list by condition

We can divide our list into sublists using a simple condition. One way to do this is with list comprehensions:

data_lang = ['R', 'Julia', 'Python']
other_lang = ['Java', 'Javascript', 'Haskell']

# divide the list values using comprehensions
data_lst = [x for x in all_lang_lst if x in data_lang]
other_lst = [x for x in all_lang_lst if x in other_lang]

# print the lists
print("The data science related languages are: " + str(data_lst))
print("The other languages are: " + str(other_lst))

Here’s the output:

The data science related languages are: ['R', 'Julia', 'Python']
The other languages are: ['Haskell', 'Java', 'Javascript']
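
The two comprehensions above walk the list twice. When the condition is a simple yes/no test, a single loop can fill both sublists in one pass. The following is a minimal sketch; the helper name split_by_condition and its predicate parameter are hypothetical and not part of the standard library:

```python
def split_by_condition(items, predicate):
    """Return (matching, non_matching) lists, keeping the original order."""
    matching, non_matching = [], []
    for item in items:
        # append to the first list when the predicate holds, else to the second
        (matching if predicate(item) else non_matching).append(item)
    return matching, non_matching


all_lang_lst = ['R', 'Julia', 'Python', 'Haskell', 'Java', 'Javascript']
data_lang = {'R', 'Julia', 'Python'}  # a set keeps membership tests fast

data_lst, other_lst = split_by_condition(all_lang_lst, lambda x: x in data_lang)
print(data_lst)
print(other_lst)
```

Unlike the two-comprehension version, every element ends up in exactly one of the two lists, even if it appears in neither reference list.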

Note: as shown above, we can use the join() string method to convert a list of strings to a single string.
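
Keep in mind that join() only accepts strings; a list containing numbers raises a TypeError. One way around this, sketched here with a hypothetical mixed list named mixed_lst, is to convert each element with str() first:

```python
# hypothetical list mixing strings and integers
mixed_lst = ['Python', 3, 'Java', 17]

# convert every element to a string before joining
mixed_str = ', '.join(str(x) for x in mixed_lst)
print(mixed_str)
```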

Divide a list to chunks

The next use case is slicing a Python list into chunks. This technique can help if you need to process a very large list and would prefer to segment it and process each part in parallel.

Here’s a short snippet you can use:

# define the list chunk size
chunk_size = 2

# create the sublists
sub_lsts = [all_lang_lst[i:i + chunk_size] for i in range(0, len(all_lang_lst), chunk_size)]

print(sub_lsts)

Here’s your output: several lists consisting of 2 elements each.

[['R', 'Julia'], ['Python', 'Haskell'], ['Java', 'Javascript']]
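
If the list length is not a multiple of the chunk size, the same slicing approach simply produces a shorter final chunk. The sketch below wraps the idea in a generator so that chunks are produced lazily; the function name chunk_list is an assumption made for illustration:

```python
def chunk_list(lst, chunk_size):
    """Yield successive slices of lst with at most chunk_size elements."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for i in range(0, len(lst), chunk_size):
        yield lst[i:i + chunk_size]


all_lang_lst = ['R', 'Julia', 'Python', 'Haskell', 'Java', 'Javascript']

# six elements in chunks of four: the last chunk holds the two leftovers
chunks = list(chunk_list(all_lang_lst, 4))
print(chunks)
```

A generator is handy for very large lists because only one chunk needs to exist at a time.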

Split Python lists by substring

In a similar fashion, we can divide the list based on whether each element contains a specific substring. We’ll start by defining a function:

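def splt_by_substrng(my_lst, my_str):
    """Split my_lst into (elements containing my_str, all other elements)."""
    subs_lst = []
    subs_lst_other = []
    for e in my_lst:
        # the 'in' operator performs a case-sensitive substring test
        if my_str in e:
            subs_lst.append(e)
        else:
            subs_lst_other.append(e)
    return subs_lst, subs_lst_other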

We’ll now call the function that we have just defined. In our case, we would like to split the programming languages containing the substring 'Ja' into a separate list. The match is case sensitive, so 'Julia' is not included. The function returns a tuple of two lists.

print("Here are the split lists: " + str(splt_by_substrng(all_lang_lst, "Ja")))

The result will be:

Here are the split lists: (['Java', 'Javascript'], ['R', 'Julia', 'Python', 'Haskell'])
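
Because the in operator is case sensitive, searching for 'ja' instead of 'Ja' would match nothing in our list. A possible variation, using a hypothetical helper named splt_by_substrng_nocase, lowercases both sides before comparing:

```python
def splt_by_substrng_nocase(my_lst, my_str):
    """Case-insensitive split: (elements containing my_str, the rest)."""
    needle = my_str.lower()
    matches = [e for e in my_lst if needle in e.lower()]
    others = [e for e in my_lst if needle not in e.lower()]
    return matches, others


all_lang_lst = ['R', 'Julia', 'Python', 'Haskell', 'Java', 'Javascript']

ja_lst, rest_lst = splt_by_substrng_nocase(all_lang_lst, "ja")
print(ja_lst)
print(rest_lst)
```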

Shuffling a Python list

Another relevant use case is to randomly reorder the elements of a list. To do that, we use the shuffle() function from Python's random module, which shuffles the list in place. Here’s a snippet:

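import random

# the list to shuffle; random.shuffle() reorders it in place
new_lst = ['Haskell', 'Julia', 'Java', 'Javascript', 'R', 'Python']
print("The original list: " + str(new_lst))
random.shuffle(new_lst)
print("The randomized list: " + str(new_lst))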

Here’s one possible result (the shuffled order will differ from run to run):

The original list: ['Haskell', 'Julia', 'Java', 'Javascript', 'R', 'Python']
The randomized list: ['Python', 'R', 'Julia', 'Java', 'Haskell', 'Javascript']
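
Note that shuffle() changes the list in place. If the original order should be kept, one option is random.sample(), which returns a new shuffled list when asked for as many elements as the list holds. The sketch below also uses a dedicated random.Random instance with a seed so that repeated runs give the same order; the seed value 42 and the names rng and shuffled_lst are arbitrary choices for illustration:

```python
import random

new_lst = ['Haskell', 'Julia', 'Java', 'Javascript', 'R', 'Python']

# a seeded generator makes the shuffle reproducible between runs
rng = random.Random(42)

# sample() with k equal to the list length returns a shuffled copy
shuffled_lst = rng.sample(new_lst, k=len(new_lst))

print("The original list is unchanged: " + str(new_lst))
print("The shuffled copy: " + str(shuffled_lst))
```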

Find the difference between two lists

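Another way to split a list is by subtraction. We can convert our lists to Python sets and subtract one from the other. Note that the result is a set, so the original order is not preserved and duplicates are dropped:

other_lst = set(all_lang_lst) - set(data_lang)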
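
Since sets are unordered, the result above may come back in any order. If the order of the original list matters, a list comprehension that uses a set only for lookups keeps it. A minimal sketch (excluded is an illustrative name):

```python
all_lang_lst = ['R', 'Julia', 'Python', 'Haskell', 'Java', 'Javascript']
data_lang = ['R', 'Julia', 'Python']

# build the lookup set once so each membership test is fast
excluded = set(data_lang)

# keep every element that is not in the excluded set, in original order
other_lst = [x for x in all_lang_lst if x not in excluded]
print(other_lst)
```

This version returns a list rather than a set, and it also keeps duplicates if the original list has any.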