def intersect(*x: Sequence, duplicate_method: DUPLICATE_METHOD = "first") -> list:
    """
    Identify the intersection of values in multiple sequences, while
    preserving the order of values in the first sequence.

    Args:
        x:
            Zero, one or more sequences of interest containing hashable values.
            We ignore missing values as defined by
            :py:meth:`~biocutils.is_missing_scalar.is_missing_scalar`.

        duplicate_method:
            Whether to keep the first or last occurrence of duplicated values
            when preserving order in the first sequence.

    Returns:
        Intersection of values across all ``x``.
    """
    nargs = len(x)
    if nargs == 0:
        return []

    first = x[0]
    if nargs == 1:
        # Special handling of n == 1, for efficiency.
        present = set()
        output = []

        def handler(f):
            if not is_missing_scalar(f) and f not in present:
                output.append(f)
                present.add(f)

        if duplicate_method == "first":
            for f in first:
                handler(f)
        else:
            for f in reversed(first):
                handler(f)
            output.reverse()
        return output

    # The 'occurrences' dict contains the count and the index of the last
    # sequence that incremented the count. The intersection consists of all
    # values where the count == number of sequences. We need to store the index
    # of the last sequence so as to avoid adding a duplicate value twice from a
    # single sequence.
    occurrences = {}
    for f in first:
        if not is_missing_scalar(f) and f not in occurrences:
            occurrences[f] = [1, 0]

    for i in range(1, nargs):
        for f in x[i]:
            if not is_missing_scalar(f) and f in occurrences:
                state = occurrences[f]
                if state[1] < i:
                    state[0] += 1
                    state[1] = i

    # Going through the first vector again to preserve order.
    output = []

    def handler(f):
        if not is_missing_scalar(f) and f in occurrences:
            state = occurrences[f]
            if state[0] == nargs and state[1] >= 0:
                output.append(f)
                state[1] = -1  # avoid duplicates

    if duplicate_method == "first":
        for f in first:
            handler(f)
    else:
        for f in reversed(first):
            handler(f)
        output.reverse()
    return output
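The effect of ``duplicate_method`` on output order can be seen in a small, self-contained sketch. This is not the library's code: `intersect_sketch` and `is_missing` are hypothetical names introduced here, `is_missing` is a stand-in that assumes only `None` counts as missing, and the sketch uses plain set intersection instead of the occurrence-counting dict in the function above. It is intended only to reproduce the documented behavior on simple inputs.

```python
from typing import Sequence


def is_missing(value) -> bool:
    # Hypothetical stand-in for is_missing_scalar; assumes only None is missing.
    return value is None


def intersect_sketch(*x: Sequence, duplicate_method: str = "first") -> list:
    # Illustrative re-implementation using sets, not the library's algorithm.
    if len(x) == 0:
        return []
    first = x[0]

    # Collect the non-missing values shared by every sequence after the first.
    # 'common' stays None when only one sequence was supplied.
    common = None
    for other in x[1:]:
        values = {v for v in other if not is_missing(v)}
        common = values if common is None else common & values

    # Walk the first sequence (forwards or backwards) to fix the output order.
    seen = set()
    output = []
    iterator = first if duplicate_method == "first" else reversed(first)
    for v in iterator:
        if is_missing(v) or v in seen:
            continue
        if common is not None and v not in common:
            continue
        seen.add(v)
        output.append(v)

    # A backwards walk collects values in reverse, so restore the order.
    if duplicate_method != "first":
        output.reverse()
    return output


a = ["B", "C", "B", "A", None]
b = ["C", "B", None]
keep_first = intersect_sketch(a, b, duplicate_method="first")
keep_last = intersect_sketch(a, b, duplicate_method="last")
```

With these inputs, `"B"` appears twice in the first sequence. Keeping its first occurrence places it before `"C"`, while keeping its last occurrence places it after `"C"`. `"A"` is dropped because it is absent from the second sequence, and `None` is ignored even though both sequences contain it.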