This library splits a given dataset into training and test sets according to a split size supplied by the user.
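
The idea can be illustrated with a minimal sketch. This is not the library's actual API: the function name `split_dataset` and the parameters `test_size` and `seed` are hypothetical, and the sketch assumes the split size is a fraction between 0 and 1 giving the share of samples that go to the test set.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_dataset(
    data: Sequence[T], test_size: float = 0.2, seed: int = 0
) -> Tuple[List[T], List[T]]:
    """Hypothetical helper: shuffle `data` and split it into (train, test).

    Assumes `test_size` is the fraction of samples placed in the test set.
    """
    if not 0.0 < test_size < 1.0:
        raise ValueError("test_size must be strictly between 0 and 1")

    items = list(data)                  # copy so the caller's data is not modified
    random.Random(seed).shuffle(items)  # seeded shuffle for reproducible splits

    n_test = round(len(items) * test_size)  # number of test samples
    return items[n_test:], items[:n_test]


train, test = split_dataset(list(range(10)), test_size=0.2)
print(len(train), len(test))  # 8 2
```

Shuffling before slicing keeps the split from depending on the original order of the data, and the fixed seed makes the same split reproducible across runs.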