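LINQ in C#
LINQ lets you query any collection that implements the IEnumerable<T> interface, whether it is an array, a list, or an XML DOM. It combines the benefits of compile-time type checking with dynamic query composition.
Basics
The basic units of LINQ are sequences and elements. A sequence is any collection that implements the IEnumerable<T> interface, and an element is an item in that collection.
A query operator is a method that transforms a sequence. A typical operator accepts an input sequence and returns an output sequence. The Enumerable class in the System.Linq namespace provides around 40 query operators, all implemented as static extension methods.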
class Program
{
public static void Main()
{
List<int> nums = new List<int> { 1, 2, 3 };
var num = nums.Where(n => n >=2).Where(n=>n%2==0);
nums.Add(4);
foreach (var s in num) Console.WriteLine(s);
var num1 = from n in nums
where n >= 2 && n % 2 == 0
select n;
foreach (var s in num1) Console.WriteLine(s);
}
}
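The small example above shows two things:
- A query can be written in two ways: (1) fluent syntax, which chains calls to query methods, and (2) query syntax, which uses a query expression. The two styles are complementary.
- The query runs only when the output sequence is enumerated (that is, when MoveNext is called), which is why the 4 added after the query was built still appears in the output.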
Fluent syntax
Chaining Query Operators
class Program
{
public static void Main()
{
string[] names = { "Tom", "Dick", "Harry", "Mary", "Jay" };
var query = names
.Where(n => n.Contains("a"))
.OrderBy(n => n.Length)
.Select(n => n.ToUpper());
foreach (var n in query) Console.WriteLine(n);
}
}
Query Expression
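The compiler processes a query expression by translating it into fluent syntax. The translation is fairly mechanical, much like the way a foreach statement is translated into a call to GetEnumerator followed by calls to MoveNext. This means that anything you can write in query syntax can also be written in fluent syntax.
Note that a query expression must end with a select or group clause.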
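As a rough illustration of that translation, the sketch below writes the same query both ways. The class and method names (QueryTranslationDemo, WithQuerySyntax, WithFluentSyntax) are invented for this example, and the fluent version is a hand-written equivalent, not the compiler's literal output.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class QueryTranslationDemo
{
    static readonly string[] Names = { "Tom", "Dick", "Harry", "Mary", "Jay" };

    // Query syntax: the expression ends with a select clause.
    public static IEnumerable<string> WithQuerySyntax() =>
        from n in Names
        where n.Contains("a")
        orderby n.Length
        select n.ToUpper();

    // Hand-written fluent equivalent of the query expression above.
    public static IEnumerable<string> WithFluentSyntax() =>
        Names.Where(n => n.Contains("a"))
             .OrderBy(n => n.Length)
             .Select(n => n.ToUpper());

    public static void Main()
    {
        Console.WriteLine(string.Join(",", WithQuerySyntax()));   // JAY,MARY,HARRY
        Console.WriteLine(string.Join(",", WithFluentSyntax()));  // JAY,MARY,HARRY
    }
}
```

Each clause maps onto one operator call, and the range variable n becomes the lambda parameter.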
Range Variable
The identifier that immediately follows from is called a range variable. A query expression also lets you introduce new range variables with let, into, an additional from clause, and join.
Query syntax VS Fluent syntax
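Each has its advantages.
Query syntax is simpler in the following cases:
- A let clause that introduces a new variable
- SelectMany, Join, or GroupJoin followed by a reference to an outer range variable
For a query that consists of a single operator, fluent syntax is more concise. Finally, many operators have no query-syntax equivalent, so you have to use fluent syntax for them, at least in part.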
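A small sketch of mixing the two styles may help here. Count and First have no query-syntax keyword, so the query expression is wrapped in parentheses and finished with a fluent call. MixedSyntaxDemo and its method names are hypothetical, chosen only for this example.

```csharp
using System;
using System.Linq;

public static class MixedSyntaxDemo
{
    static readonly string[] Names = { "Tom", "Dick", "Harry", "Mary", "Jay" };

    // Count has no query-syntax keyword, so the query expression is
    // parenthesized and completed with a fluent call.
    public static int CountNamesContainingA() =>
        (from n in Names where n.Contains("a") select n).Count();

    // First is also fluent-only, while orderby exists in query syntax.
    public static string FirstAlphabetically() =>
        (from n in Names orderby n select n).First();

    public static void Main()
    {
        Console.WriteLine(CountNamesContainingA()); // 3
        Console.WriteLine(FirstAlphabetically());   // Dick
    }
}
```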
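Deferred execution
Most query operators execute not when the query is constructed but when it is enumerated, with the following exceptions:
- Operators that return a scalar value or a single element, such as First and Count
- Conversion operators, such as ToArray, ToList, ToDictionary, and ToLookup
These operators execute immediately, because their result types have no mechanism for deferred execution.
Deferred execution is important because it decouples query construction from query execution.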
Reevaluation
When a query is enumerated again, it is executed again.
public static void Main()
{
var numbers = new List<int> { 1, 2 };
IEnumerable<int> query = numbers.Select(n => n * 10);
foreach (int n in query) Console.Write(n + " | ");// 10|20|
numbers.Clear();
foreach (int n in query) Console.Write(n + " | ");//<nothing>
}
You can avoid reevaluation by calling a conversion operator, which captures the results at that moment:
public static void Main()
{
var numbers = new List<int> { 1, 2 };
var query = numbers.Select(n => n * 10).ToList(); // executes immediately
foreach (int n in query) Console.Write(n + " | "); // 10 | 20 |
numbers.Clear();
foreach (int n in query) Console.Write(n + " | "); // 10 | 20 | (the list is a snapshot)
}
Captured Variables
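// Pitfall: every lambda captures the same loop variable i. By the time the
// query is enumerated, i == vowels.Length, so vowels[i] throws
// IndexOutOfRangeException.
IEnumerable<char> query = "not what you might expect";
string vowels = "aeiou";
for (int i = 0; i < vowels.Length; i++)
{
    query = query.Where(c => c != vowels[i]);
}
foreach (char c in query) Console.Write(c);

// Stopping the loop one step early avoids the exception but is still wrong:
// i ends at 4, so all four filters test vowels[4] and only 'u' is removed.
IEnumerable<char> query = "not what you might expect";
string vowels = "aeiou";
for (int i = 0; i < vowels.Length - 1; i++)
{
    query = query.Where(c => c != vowels[i]);
}
foreach (char c in query) Console.Write(c); // not what yo might expect

// Fix: copy the value into a variable declared inside the loop, so that each
// lambda captures its own copy.
IEnumerable<char> query = "not what you might expect";
string vowels = "aeiou";
for (int i = 0; i < vowels.Length; i++)
{
    var vo = vowels[i];
    query = query.Where(c => c != vo);
}
foreach (char c in query) Console.Write(c); // nt wht y mght xpct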
Subqueries
var names = new List<string>{ "Tom", "Dick", "Harry", "Jay" };
var re = names.Where(n=>n.Length==names.OrderBy(n2=>n2.Length).First().Length);
names.Add("jim");
foreach (var r in re) Console.WriteLine(r + " ");
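For local collections, a subquery like this is inefficient, because the outer query re-evaluates the subquery for every element. In this example, names is sorted once per element just to find the shortest length.
The fix is to evaluate the subquery once, outside the query, and store the result in a local variable:
var names = new List<string> { "Tom", "Dick", "Harry", "Jay" };
int shortest = names.Min(n => n.Length); // evaluated once, before the query
var re = names.Where(n => n.Length == shortest);
names.Add("jim");
foreach (var r in re) Console.WriteLine(r + " ");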
Composition strategies for building complex queries
- Progressive query construction: build the query up step by step
- Using the into keyword: introduce a new range variable with into
- Wrapping queries
into
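When used for query continuation, into can appear only after a select or group clause. You can think of it as starting a new query, which allows new where, orderby, and select clauses, although the whole thing is still a single query.
Note that after into, all earlier range variables go out of scope.
For example, a query that begins from n1 in names select n1.ToUpper() into n2 and then refers to n1 in a later where clause is illegal: after into n2, only n2 is in scope, and n1 is no longer visible.
The corrected version refers only to n2:
public static void Main()
{
    string[] names = { "Tom", "Dick", "Harry", "Mary", "Jay", "John" };
    var query =
        from n1 in names
        select n1.ToUpper()
        into n2
        where n2.Contains("O")
        select n2;
    foreach (var s in query) Console.WriteLine(s);
}
This compiles and runs correctly.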
Wrapping Queries
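Wrapping lets you turn:
var tempQuery = tempQueryExpr;
var finalQuery = from .... in tempQuery;
into:
var finalQuery = from .... in tempQueryExpr;
A wrapped query is semantically equivalent to progressive query construction and to using the into keyword (without an intermediate variable).
For example, with progressive construction:
var query =
    from n in names
    select n.Replace("a", "").Replace("e", "").Replace("i", "")
            .Replace("o", "").Replace("u", "");
query =
    from n in query
    where n.Length > 2
    orderby n
    select n;
The corresponding wrapped query:
var query =
    from n1 in
    (
        from n2 in names
        select n2.Replace("a", "").Replace("e", "").Replace("i", "")
                 .Replace("o", "").Replace("u", "")
    )
    where n1.Length > 2
    orderby n1
    select n1;
Wrapped queries may look a little like subqueries, since both have an inner and an outer query. But a subquery sits inside the lambda expression of an operator, whereas the inner query of a wrapped query is the input sequence of the from clause.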
Projection strategies
Object initializers
So far, every select clause has projected a scalar element type. You can also project into more complex types. For example, suppose that in the first step of the query we want to keep both the original version of each name and the version with the vowels removed.
public class TempProjectionItem
{
    public string Original;   // original name
    public string Vowelless;  // name with the vowels removed
}
class Program
{
    public static void Main()
    {
        string[] names = { "Tom", "Dick", "Harry", "Mary", "Jay", "John" };
        var temp =
            from n in names
            select new TempProjectionItem // object initializer
            {
                Original = n,
                Vowelless = n.Replace("a", "").Replace("e", "").Replace("i", "")
                             .Replace("o", "").Replace("u", "")
            };
        var query =
            from item in temp
            where item.Vowelless.Length > 2
            select item.Original;
        foreach (var i in query) Console.WriteLine(i);
    }
}
Anonymous types
Anonymous types let you structure intermediate results without writing a dedicated class. In the example above, we can drop the TempProjectionItem class and use an anonymous type instead:
public static void Main()
{
    string[] names = { "Tom", "Dick", "Harry", "Mary", "Jay", "John" };
    var temp =
        from n in names
        select new // anonymous type with an object initializer
        {
            Original = n,
            Vowelless = n.Replace("a", "").Replace("e", "").Replace("i", "")
                         .Replace("o", "").Replace("u", "")
        };
    var query =
        from item in temp
        where item.Vowelless.Length > 2
        select item.Original;
    foreach (var i in query) Console.WriteLine(i);
}
As you can see, an anonymous type spares us from writing a dedicated class to hold intermediate results; we simply instantiate and initialize it with new. Behind the scenes, the compiler generates a class for us. We have to declare the variable with var in this case, because the anonymous type has no name that we can write in source code.
We can write the whole query as a single expression with into:
string[] names = { "Tom", "Dick", "Harry", "Mary", "Jay", "John" };
var query =
    from n in names
    select new
    {
        Original = n,
        Vowelless = n.Replace("a", "").Replace("e", "").Replace("i", "")
                     .Replace("o", "").Replace("u", "")
    }
    into temp
    where temp.Vowelless.Length > 2
    select temp.Original;
foreach (var i in query) Console.WriteLine(i);
let keyword
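The let keyword introduces a new variable while keeping the existing range variable in scope. This differs from into, after which the earlier range variables are out of scope.
public static void Main()
{
    string[] names = { "Tom", "Dick", "Harry", "Mary", "Jay", "John" };
    var query =
        from n in names
        let vowelless = n.Replace("a", "").Replace("e", "").Replace("i", "")
                         .Replace("o", "").Replace("u", "")
        where vowelless.Length > 2
        select n; // thanks to let, n is still in scope
    foreach (var i in query) Console.WriteLine(i);
}
The compiler implements let by projecting into a temporary anonymous type that contains both the range variable and the new variable. In other words, the query is translated into something like the previous example.
You can have any number of let clauses, before or after a where clause. A let clause can reference any variable introduced before it.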
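To make that translation more concrete, here is a minimal sketch that writes the let query next to an approximate hand translation using an anonymous type. LetTranslationDemo, RemoveVowels, and the other names are hypothetical, and the fluent version only approximates what the compiler generates.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class LetTranslationDemo
{
    static readonly string[] Names = { "Tom", "Dick", "Harry", "Mary", "Jay", "John" };

    // Helper invented for this sketch: strips lowercase vowels.
    static string RemoveVowels(string s) =>
        s.Replace("a", "").Replace("e", "").Replace("i", "")
         .Replace("o", "").Replace("u", "");

    // Query syntax with let: n stays in scope alongside vowelless.
    public static IEnumerable<string> WithLet() =>
        from n in Names
        let vowelless = RemoveVowels(n)
        where vowelless.Length > 2
        select n;

    // Approximate hand translation: carry both values in an anonymous type.
    public static IEnumerable<string> WithAnonymousType() =>
        Names.Select(n => new { n, vowelless = RemoveVowels(n) })
             .Where(t => t.vowelless.Length > 2)
             .Select(t => t.n);

    public static void Main()
    {
        Console.WriteLine(string.Join(",", WithLet()));            // Dick,Harry,Mary,John
        Console.WriteLine(string.Join(",", WithAnonymousType()));  // Dick,Harry,Mary,John
    }
}
```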
LINQ operators
The standard query operators fall into three categories:
- Sequence in, sequence out
- Sequence in, single element or scalar value out
- Nothing in, sequence out (generation methods)
Sequence in, Sequence out
Filtering
IEnumerable<TSource> to IEnumerable<TSource>
Filtering operators return a subset of the original elements. The operators are Where, Take, TakeWhile, Skip, SkipWhile, and Distinct.
- Where
where bool-expression
string[] names = { "Tom", "Dick", "Harry", "Mary", "Jay" };
var query = names.Where(n => n.EndsWith("y"));
foreach (var s in query) Console.WriteLine(s);
The equivalent query expression:
string[] names = { "Tom", "Dick", "Harry", "Mary", "Jay" };
var query =
    from n in names
    where n.EndsWith("y")
    select n;
foreach (var s in query) Console.WriteLine(s);
A where clause can appear more than once in a query expression:
string[] names = { "Tom", "Dick", "Harry", "Mary", "Jay" };
var query =
    from n in names
    where n.Length > 3
    let u = n.ToUpper()
    where u.EndsWith("Y")
    select u;
foreach (var s in query) Console.WriteLine(s);
Or even back to back:
string[] names = { "Tom", "Dick", "Harry", "Mary", "Jay" };
var query =
    from n in names
    where n.Length > 3
    where n.EndsWith("y")
    select n;
foreach (var s in query) Console.WriteLine(s);
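The predicate of Where can optionally accept a second argument of type int, which is the index of each element in the input sequence:
string[] names = { "Tom", "Dick", "Harry", "Mary", "Jay" };
var query = names.Where((n, i) => i % 2 == 0);
foreach (var s in query) Console.WriteLine(s);
Query syntax has no indexed form of where. One possible counterpart as a query expression is:
string[] names = { "Tom", "Dick", "Harry", "Mary", "Jay" };
var query =
    from n in names
    where names.ToList().IndexOf(n) % 2 == 0
    select n;
foreach (var s in query) Console.WriteLine(s);
Note that this converts the array to a list and calls IndexOf to look up the index of each element. The lookup is repeated for every element, and IndexOf returns the first occurrence, so the result is wrong when the sequence contains duplicates. The fluent form is preferable.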
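The sketch below compares the indexed Where with two query-expression alternatives on an input that contains a duplicate. Projecting the index with Select first is one way to keep query syntax without relying on IndexOf; IndexedWhereDemo and its member names are made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class IndexedWhereDemo
{
    // "Dick" appears twice so that the IndexOf approach visibly misbehaves.
    static readonly string[] Names = { "Tom", "Dick", "Dick", "Mary", "Jay" };

    // Fluent syntax: the operator supplies the index.
    public static IEnumerable<string> Fluent() =>
        Names.Where((n, i) => i % 2 == 0);

    // Query syntax over an indexed projection (value tuples need C# 7 or later).
    public static IEnumerable<string> QueryOverIndexedProjection() =>
        from pair in Names.Select((n, i) => (Name: n, Index: i))
        where pair.Index % 2 == 0
        select pair.Name;

    // The IndexOf workaround: always finds the FIRST occurrence of a value.
    public static IEnumerable<string> IndexOfWorkaround() =>
        from n in Names
        where Names.ToList().IndexOf(n) % 2 == 0
        select n;

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Fluent()));                     // Tom,Dick,Jay
        Console.WriteLine(string.Join(",", QueryOverIndexedProjection())); // Tom,Dick,Jay
        Console.WriteLine(string.Join(",", IndexOfWorkaround()));          // Tom,Jay
    }
}
```

The second "Dick" sits at index 2, but IndexOf reports index 1 for it, so the workaround drops it.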
Take and Skip
Take returns the first n elements and discards the rest; Skip discards the first n elements and returns the rest.
string[] names = { "Tom", "Dick", "Harry", "Mary", "Jay" };
var query = names.Take(3); // Tom, Dick, Harry
foreach (var s in query) Console.WriteLine(s);

string[] names = { "Tom", "Dick", "Harry", "Mary", "Jay" };
var query = names.Skip(3); // Mary, Jay
foreach (var s in query) Console.WriteLine(s);
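TakeWhile and SkipWhile
TakeWhile enumerates the input sequence and emits each element until the predicate returns false; it then ignores the remaining elements:
int[] numbers = { 3, 5, 2, 234, 4, 1 };
var takeWhileSmall = numbers.TakeWhile(n => n < 100); // 3, 5, 2
foreach (var i in takeWhileSmall) Console.WriteLine(i);
SkipWhile enumerates the input sequence and ignores each element until the predicate returns false; it then emits that element and all the remaining ones:
int[] numbers = { 3, 5, 2, 234, 4, 1 };
var skipWhileSmall = numbers.SkipWhile(n => n < 100); // 234, 4, 1
foreach (var i in skipWhileSmall) Console.WriteLine(i);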
Distinct
Distinct returns the input sequence with duplicates removed. You can optionally pass it a custom equality comparer (see C#之集合 - JohnYang819 - 博客园 (cnblogs.com)).
public class Customer
{
    public string LastName;
    public string FirstName;
    public Customer(string last, string first)
    {
        LastName = last;
        FirstName = first;
    }
    public override string ToString()
    {
        return FirstName + " " + LastName;
    }
}
// Treats two customers as equal when both names match.
public class LastFirstEqualityComparer : EqualityComparer<Customer>
{
    public override bool Equals(Customer x, Customer y) =>
        x.LastName == y.LastName && x.FirstName == y.FirstName;
    public override int GetHashCode(Customer obj) =>
        (obj.LastName + ";" + obj.FirstName).GetHashCode();
}
class Program
{
    public static void Main()
    {
        var c1 = new Customer("John", "Yang");
        var c2 = new Customer("John", "Yang");
        var c3 = new Customer("Tom", "Kong");
        Customer[] cts = new Customer[] { c1, c2, c3 };
        // Default comparer: reference equality, so c1 and c2 both survive.
        var c5 = cts.Distinct();
        foreach (var c in c5) Console.Write(c + " ");
        Console.WriteLine();
        // Custom comparer: c1 and c2 are considered duplicates.
        var cmp = new LastFirstEqualityComparer();
        var c4 = cts.Distinct(cmp);
        foreach (var c in c4) Console.Write(c + " ");
    }
}
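Projecting
IEnumerable<TSource> to IEnumerable<TResult>
The operators are Select and SelectMany.
Select
With Select, you always get the same number of elements as in the input sequence, but each element has been transformed by the lambda function.
Select is often used to project into anonymous types:
var query =
    from f in FontFamily.Families
    select new { f.Name, };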
Example 1:
var nums = new int[] { 1, 2, 3, 4, 5 };
var query =
    from n in nums
    select new { num1 = n % 2, num2 = n % 3 };
foreach (var i in query)
    Console.WriteLine(i.num1.ToString() + " " + i.num2.ToString());
Example 2:
public class Customer
{
    public string LastName;
    public string FirstName;
    public Customer(string last, string first)
    {
        LastName = last;
        FirstName = first;
    }
    public override string ToString()
    {
        return FirstName + " " + LastName;
    }
}
class Program
{
    public static void Main()
    {
        var csts = new Customer[]
        {
            new Customer("John", "Yang"),
            new Customer("Zank", "Mofri"),
            new Customer("Padd", "Jwee")
        };
        var query =
            from c in csts
            select new { c.FirstName, c.LastName.Length };
        foreach (var q in query)
            Console.WriteLine(q.FirstName + " " + q.Length.ToString());
    }
}
Examples 1 and 2 show that a property of an anonymous type must be named explicitly when its value is computed from an expression, as in Example 1. When the initializer is a simple member access, as in Example 2, the name can be omitted and the compiler takes it from the member (here FirstName and Length).
A select clause that performs no transformation exists purely to satisfy the requirement that a query expression end with a select or group clause.
indexed projection
string[] names = { "Tom", "Dick", "Harry", "Mary", "Jay" };
var query = names.Select((s, i) => i + "=" + s); // 0=Tom, 1=Dick, ...
foreach (var q in query) Console.WriteLine(q);
Select subqueries and object hierarchies
First, a quick look at the file and directory objects in System.IO (for details on FileInfo and DirectoryInfo, see C#操作文件属性 - 百度文库 (baidu.com)):
public static void Main()
{
    FileInfo[] files = new DirectoryInfo(@"G:\ipad电子书\C#\").GetFiles();
    foreach (var f in files) Console.WriteLine(f);
}

DirectoryInfo[] dirs = new DirectoryInfo(@"G:\ipad电子书\C#\").GetDirectories();
foreach (var d in dirs) Console.WriteLine(d);

DirectoryInfo[] dirs = new DirectoryInfo(@"G:\ipad电子书\C#\").GetDirectories();
var query =
    from d in dirs
    where (d.Attributes & FileAttributes.System) == 0
    select new
    {
        DirectoryName = d.FullName,
        Created = d.CreationTime,
        // Subquery: the non-hidden files of each directory
        Files =
            from f in d.GetFiles()
            where (f.Attributes & FileAttributes.Hidden) == 0
            select new { FileName = f.Name, f.Length }
    };
foreach (var dirFiles in query)
{
    Console.WriteLine("Directory: " + dirFiles.DirectoryName + " Created in " + dirFiles.Created);
    foreach (var file in dirFiles.Files)
        Console.WriteLine(" " + file.FileName + "Len:" + file.Length);
}
// Print the raw attribute flags to see how the System filter evaluates
foreach (var d in dirs)
{
    Console.WriteLine((int)d.Attributes);
    Console.WriteLine((int)FileAttributes.System);
    Console.WriteLine((d.Attributes & FileAttributes.System) == 0);
}
As this example shows, you can nest a subquery inside a select clause to build hierarchical objects.
SelectMany
SelectMany concatenates the selected subsequences into a single flat sequence.
First, an example with Select:
string[] fullNames = { "Anne Williams", "John FFred Smith", "Sue Greeen" };
var query = fullNames.Select(name => name.Split());
foreach (var q in query)
{
    foreach (var n in q) Console.WriteLine(n);
}
Here Select produces an IEnumerable<string[]>, so a nested loop is needed to visit every element. SelectMany produces a flat IEnumerable<string> directly.
string[] fullNames = { "Anne Williams", "John FFred Smith", "Sue Greeen" };
var query = fullNames.SelectMany(name => name.Split());
foreach (var q in query)
{
    Console.WriteLine(q);
}
The equivalent query syntax (the extra from clause is also called an additional generator):
from identifier1 in enumerable-expression1
from identifier2 in enumerable-expression2
....
string[] fullNames = { "Anne Williams", "John FFred Smith", "Sue Greeen" };
var query =
    from fullName in fullNames
    from name in fullName.Split()
    select name;
foreach (var q in query)
{
    Console.WriteLine(q);
}
Multiple range variables
In the example above, both name and fullName stay in scope until the query ends or an into clause is reached. This gives query syntax an edge over fluent syntax here. SelectMany, by contrast, discards the structure of the outer elements: in the example, the IEnumerable<string[]> is flattened directly into an IEnumerable<string>.
For example:
string[] fullNames = { "Anne Williams", "John FFred Smith", "Sue Greeen" };
var query =
    from fullName in fullNames
    from name in fullName.Split()
    select name + " came from " + fullName;
foreach (var q in query)
{
    Console.WriteLine(q);
}
Behind the scenes, the compiler does some extra work to let us access both name and fullName. In fluent syntax with a single-argument SelectMany, you have to project into an anonymous type yourself to keep the outer element available:
string[] fullNames = { "Anne Williams", "John FFred Smith", "Sue Greeen" };
var query = fullNames
    .SelectMany(fName => fName.Split().Select(name => new { name, fName }))
    .Select(x => x.name + " Came from " + x.fName);
foreach (var q in query)
{
    Console.WriteLine(q);
}
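As an alternative to the anonymous type, Enumerable.SelectMany also has an overload that takes a result selector, which receives the outer element together with each inner element. The sketch below uses it with the same input; SelectManyResultSelectorDemo and its method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SelectManyResultSelectorDemo
{
    static readonly string[] FullNames = { "Anne Williams", "John FFred Smith", "Sue Greeen" };

    // Fluent syntax: the second lambda sees both the outer and the inner element.
    public static IEnumerable<string> WithResultSelector() =>
        FullNames.SelectMany(
            fullName => fullName.Split(),                          // collection selector
            (fullName, name) => name + " came from " + fullName);  // result selector

    // The same query with an additional generator in query syntax.
    public static IEnumerable<string> WithQuerySyntax() =>
        from fullName in FullNames
        from name in fullName.Split()
        select name + " came from " + fullName;

    public static void Main()
    {
        foreach (var line in WithResultSelector()) Console.WriteLine(line);
    }
}
```

Both methods yield seven strings, starting with "Anne came from Anne Williams".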
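When writing such additional generators, there are two basic patterns: (1) expanding and flattening subsequences (as in the example above), and (2) a Cartesian product, also called a cross join. In the first pattern the second variable depends on the first; in the second pattern the two variables are independent.
- Cartesian product examples
int[] nums = { 1, 2, 3 };
string[] letters = { "a", "b" };
var query =
    from n in nums
    from l in letters
    select n.ToString() + l;
foreach (var r in query) Console.WriteLine(r);

string[] players = { "Tom", "Jay", "Mary" };
var query =
    from p1 in players
    from p2 in players
    where p1.CompareTo(p2) < 0 // keep each pairing once, drop self-pairings
    select p1 + " VS " + p2;
foreach (var r in query) Console.WriteLine(r);
Reversing the filter condition:
string[] players = { "Tom", "Jay", "Mary" };
var query =
    from p1 in players
    from p2 in players
    where p1.CompareTo(p2) > 0
    select p1 + " VS " + p2;
foreach (var r in query) Console.WriteLine(r);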
Joining
Query syntax:
from outer-var in outer-enumerable
join inner-var in inner-enumerable on outer-key-expr equals inner-key-expr
[into identifier]
Join and GroupJoin combine two input sequences into a single output sequence. Join emits flat output; GroupJoin emits hierarchical output.
For local in-memory collections, the advantage of Join and GroupJoin is efficiency: the inner sequence is loaded first, which avoids enumerating it over and over. The drawback is that they offer only the equivalent of inner and left outer joins.
class Program
{
    public class Student
    {
        public int StID;
        public string LastName;
    }
    public class CourseStudent
    {
        public string CourseName;
        public int StID;
    }
    static Student[] students = new Student[]
    {
        new Student { StID = 1, LastName = "Carson" },
        new Student { StID = 2, LastName = "Klassen" },
        new Student { StID = 3, LastName = "Fleming" },
    };
    static CourseStudent[] studentsInCourses = new CourseStudent[]
    {
        new CourseStudent { CourseName = "Art", StID = 1 },
        new CourseStudent { CourseName = "Art", StID = 2 },
        new CourseStudent { CourseName = "History", StID = 1 },
        new CourseStudent { CourseName = "History", StID = 3 },
        new CourseStudent { CourseName = "Physics", StID = 3 },
    };
    public static void Main()
    {
        var query =
            from s in students
            join c in studentsInCourses on s.StID equals c.StID
            where c.CourseName == "History"
            select s.LastName;
        foreach (var q in query) Console.WriteLine(q); // Carson, Fleming
    }
}
Approach 1:
var query =
    from s in students
    join c in studentsInCourses on s.StID equals c.StID
    select s.LastName + " select Course " + c.CourseName;
foreach (var q in query) Console.WriteLine(q);
Approach 2:
var query =
    from s in students
    from c in studentsInCourses
    where s.StID == c.StID
    select s.LastName + " select Course " + c.CourseName;
foreach (var q in query) Console.WriteLine(q);
Approach 1 uses join; approach 2 is the "equivalent" formulation with an additional from clause and a where filter. Approach 1 is more efficient on local collections. A query can also contain more than one join clause.
Joining on multiple keys
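from x in sequenceX
join y in sequenceY
    on new { K1 = x.Prop1, K2 = x.Prop2 }
    equals new { K1 = y.Prop3, K2 = y.Prop4 }
....
The two anonymous types must have identical property names and types, in the same order, for the key comparison to compile.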
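A runnable sketch of a composite-key join follows. The enrollment and grade data, the tuple shapes, and the MultiKeyJoinDemo name are all invented for illustration; only the anonymous-type key pattern comes from the syntax above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class MultiKeyJoinDemo
{
    // Hypothetical data: a student can enroll in several terms.
    static readonly (int StID, string Term, string Course)[] Enrollments =
    {
        (1, "2023A", "Art"),
        (1, "2023B", "History"),
        (2, "2023A", "Art"),
    };

    static readonly (int StID, string Term, string Letter)[] Grades =
    {
        (1, "2023A", "B"),
        (1, "2023B", "A"),
        (2, "2023B", "C"),
    };

    // Rows match only when BOTH StID and Term are equal.
    public static IEnumerable<string> JoinOnStudentAndTerm() =>
        from e in Enrollments
        join g in Grades
            on new { e.StID, e.Term } equals new { g.StID, g.Term }
        select e.Course + ":" + g.Letter;

    public static void Main()
    {
        foreach (var row in JoinOnStudentAndTerm()) Console.WriteLine(row);
        // Art:B
        // History:A
    }
}
```

Student 2 has an enrollment and a grade, but in different terms, so that pair is not joined.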
Joining in fluent syntax
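The query expression
from c in customers
join p in purchases on c.ID equals p.CustomerID
select new { c.Name, p.Description, p.Price }
is equivalent to the following fluent syntax:
customers.Join(
    purchases,
    c => c.ID,                                          // outer key selector
    p => p.CustomerID,                                  // inner key selector
    (c, p) => new { c.Name, p.Description, p.Price });  // result selector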
GroupJoin
The query syntax for GroupJoin is the same as for Join, except that the join clause is followed by an into clause.
Unlike the into that can appear only after select and group, an into that follows a join clause means GroupJoin.
public static void Main()
{
    var query =
        from s in students
        join c in studentsInCourses on s.StID equals c.StID
        into courses
        select new { s.LastName, courses };
    foreach (var q in query)
    {
        Console.WriteLine(q.LastName + ":");
        foreach (var p in q.courses)
            Console.Write(p.CourseName + ",");
        Console.WriteLine();
    }
}
By default, GroupJoin behaves like a left outer join: every element of the outer sequence is included, whether or not it has any matching inner elements.
To get the equivalent of an inner join, where outer elements with an empty inner sequence are excluded, you need to filter on the grouped inner sequence.
For background on left outer join, right outer join, and inner join, see SQL 内连接(inner join)与外连接(left outer join 、right outer join )区别_Shirley的博客-CSDN博客_left outer.
To demonstrate, we add a new student and run the GroupJoin again:
class Program
{
public class Student
{
public int StID;
public string LastName;
}
public class CourseStudent
{
public string CourseName;
public int StID;
}
static Student[] students=new Student[]
{
new Student{StID=1,LastName="Carson"},
new Student{StID=2,LastName="Klassen"},
new Student{StID=3,LastName="Fleming"},
new Student{StID=4,LastName="JohnYang"}
};
static CourseStudent[] studentsInCourses = new CourseStudent[]
{
new CourseStudent{CourseName="Art",StID=1},
new CourseStudent{CourseName="Art",StID=2},
new CourseStudent{CourseName="History",StID=1},
new CourseStudent{CourseName="History",StID=3},
new CourseStudent{CourseName="Physics",StID=3},
};
public static void Main()
{
var query = from s in students
join c in studentsInCourses on s.StID equals c.StID
into courses
select new { s.LastName, courses };
foreach (var q in query)
{
Console.WriteLine(q.LastName+":");
foreach (var p in q.courses)
Console.Write(p.CourseName+",");
Console.WriteLine();
}
}
}
JohnYang, who has no courses, is still included, which confirms that GroupJoin behaves like a left outer join by default.
Now we filter:
var query = from s in students
join c in studentsInCourses on s.StID equals c.StID
into courses
where courses.Any()
select new { s.LastName, courses };
foreach (var q in query)
{
Console.WriteLine(q.LastName+":");
foreach (var p in q.courses)
Console.Write(p.CourseName+",");
Console.WriteLine();
}
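JohnYang, who has no courses, is now excluded, which shows that we have the equivalent of an inner join.
Note that for GroupJoin, the clauses after into operate on the groups of inner elements that have already been joined. To filter individual inner elements before they are grouped, apply the filter to the inner sequence before the join, for example:
from c in customers
join p in purchases.Where(p2 => p2.Price > 1000)
    on c.ID equals p.CustomerID
    into cusPurchases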
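The fragment above can be turned into a small runnable sketch. The customer and purchase data, the tuple shapes, and the PreFilteredGroupJoinDemo name are assumptions made only for this example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PreFilteredGroupJoinDemo
{
    // Hypothetical data for illustration.
    static readonly (int ID, string Name)[] Customers =
    {
        (1, "Tom"),
        (2, "Dick"),
    };

    static readonly (int CustomerID, string Description, decimal Price)[] Purchases =
    {
        (1, "Bike", 1500m),
        (1, "Pen", 2m),
        (2, "Book", 20m),
    };

    // The inner sequence is filtered BEFORE the join, so each group holds
    // only purchases over 1000. Customers without such purchases still
    // appear, with an empty group (left outer join behavior).
    public static IEnumerable<string> ExpensivePurchaseCounts() =>
        from c in Customers
        join p in Purchases.Where(p2 => p2.Price > 1000)
            on c.ID equals p.CustomerID
            into custPurchases
        select c.Name + ":" + custPurchases.Count();

    public static void Main()
    {
        foreach (var line in ExpensivePurchaseCounts()) Console.WriteLine(line);
        // Tom:1
        // Dick:0
    }
}
```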
Zip
int[] numbers = { 3, 5, 7 };
string[] words = { "Three", "Five", "Seven" };
var zip = numbers.Zip(words); // without a result selector, yields (First, Second) tuples
foreach (var z in zip)
{
    Console.WriteLine(z.First.ToString() + " is " + z.Second);
}
int[] numbers = { 3, 5, 7 };
string[] words = { "Three", "Five", "Seven" };
var zip = numbers.Zip(words,(n,w)=>n.ToString()+" is "+w);
foreach(var z in zip)
{
Console.WriteLine(z);
}
Grouping
query syntax
group element-expression by key-expression
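GroupBy organizes the input sequence into a sequence of groups. Each group has a Key property, whose value is computed by the key selector passed to GroupBy.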
string[] files = Directory.GetFiles(@"C:\Users\PC\Downloads\");
var query = files.GroupBy(file => Path.GetExtension(file));
foreach(IGrouping<string,string> grouping in query)
{
Console.WriteLine("Extension:" + grouping.Key);
foreach (string filename in grouping)
Console.WriteLine(" -" + filename);
}
The equivalent query syntax:
string[] files = Directory.GetFiles(@"C:\Users\PC\Downloads\");
//var query = files.GroupBy(file => Path.GetExtension(file));
var query = from f in files
group f by Path.GetExtension(f);
foreach(IGrouping<string,string> grouping in query)
{
Console.WriteLine("Extension:" + grouping.Key);
foreach (string filename in grouping)
Console.WriteLine(" -" + filename);
}
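By default, GroupBy only groups the original elements and leaves them unchanged. You can transform them as well, either by passing a second argument, elementSelector, or by doing the transformation directly in query syntax: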
string[] files = Directory.GetFiles(@"C:\Users\PC\Downloads\");
//var query = files.GroupBy(file => Path.GetExtension(file));
var query = from f in files
group f.ToUpper() by Path.GetExtension(f);
foreach(IGrouping<string,string> grouping in query)
{
Console.WriteLine("Extension:" + grouping.Key);
foreach (string filename in grouping)
Console.WriteLine(" -" + filename);
}
string[] files = Directory.GetFiles(@"C:\Users\PC\Downloads\");
var query = files.GroupBy(file => Path.GetExtension(file),file=>file.ToUpper());
//var query = from f in files
// group f.ToUpper() by Path.GetExtension(f);
foreach (IGrouping<string,string> grouping in query)
{
Console.WriteLine("Extension:" + grouping.Key);
foreach (string filename in grouping)
Console.WriteLine(" -" + filename);
}
Grouping by multiple keys
With an anonymous type as the key, you can group by a composite key:
string[] names = { "Tom", "Jason", "John", "Peter", "Joee", "Lucy" };
var query = from n in names
group n by new { fl = n[0], len = n.Length };
foreach (var grouping in query)
{
Console.WriteLine(grouping.Key.fl + " " + grouping.Key.len.ToString());
foreach (var g in grouping)
Console.WriteLine(g);
}
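Since the directory examples depend on the local file system, here is a self-contained sketch of GroupBy with an elementSelector that runs on in-memory data. GroupingDemo and ByFirstLetter are hypothetical names, and joining each group into a string is just a convenient way to inspect the result.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class GroupingDemo
{
    static readonly string[] Names = { "Tom", "Jason", "John", "Peter", "Joee", "Lucy" };

    // Key selector: first letter. Element selector: upper-cased name.
    // Each group is flattened to a "|"-separated string for display.
    public static Dictionary<char, string> ByFirstLetter() =>
        Names.GroupBy(n => n[0], n => n.ToUpper())
             .ToDictionary(g => g.Key, g => string.Join("|", g));

    public static void Main()
    {
        foreach (var pair in ByFirstLetter())
            Console.WriteLine(pair.Key + ": " + pair.Value);
    }
}
```

With this input, the group for 'J' contains JASON, JOHN, and JOEE in their original order.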
Set operators
Concat and Union
int[] seq1 = { 1, 2, 3 }, seq2 = { 3, 4, 5 };
var concat = seq1.Concat(seq2); // 1 2 3 3 4 5 (keeps duplicates)
var union = seq1.Union(seq2);   // 1 2 3 4 5 (removes duplicates)
foreach (var s in concat) Console.Write(s + " ");
Console.WriteLine();
foreach (var s in union) Console.Write(s + " ");
MethodInfo[] methods = typeof(string).GetMethods();
PropertyInfo[] props = typeof(string).GetProperties();
var both = methods.Union<MemberInfo>(props);
foreach (var b in both) Console.WriteLine(b.Name);
Intersect and Except
int[] seq1 = { 1, 2, 3 }, seq2 = { 3, 4, 5 };
var comm = seq1.Intersect(seq2); // 3 (elements present in both)
var dif1 = seq1.Except(seq2);    // 1 2 (in seq1 but not in seq2)
var dif2 = seq2.Except(seq1);    // 4 5 (in seq2 but not in seq1)
foreach (var s in comm) Console.Write(s + " ");
Console.WriteLine();
foreach (var s in dif1) Console.Write(s + " ");
Console.WriteLine();
foreach (var s in dif2) Console.Write(s + " ");
Conversion methods
- OfType and Cast
OfType and Cast accept a nongeneric IEnumerable collection and return a generic IEnumerable<T>.
var classicList = new ArrayList();
classicList.Add("string");
classicList.Add(23);
foreach (var s in classicList) Console.WriteLine(s);
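ArrayList is a nongeneric, resizable collection that holds elements of type object, so we use it here to explore OfType and Cast.
Cast and OfType differ in how they handle an element of an incompatible type: OfType skips it, whereas Cast throws an exception.
OfType: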
var classicList = new ArrayList();
classicList.Add("string");
classicList.Add(23);
classicList.Add(34);
classicList.Add("string2");
var seq = classicList.OfType<int>();
foreach (var s in seq) Console.WriteLine(s);
Console.WriteLine("***************");
var seq1 = classicList.OfType<string>();
foreach (var s in seq1) Console.WriteLine(s);
Cast
:
var classicList = new ArrayList();
classicList.Add("string");
classicList.Add(23);
classicList.Add(34);
classicList.Add("string2");
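var seq = classicList.Cast<int>();
foreach (var s in seq) Console.WriteLine(s); // throws InvalidCastException on the first element ("string")
Console.WriteLine("***************");
var seq1 = classicList.Cast<string>();
foreach (var s in seq1) Console.WriteLine(s); // would print "string", then throw on 23
Note that OfType tests the runtime type of each element (as the is operator does), and Cast performs a direct cast on each element. Neither applies numeric conversions, so a boxed double is not an int and OfType<int> yields nothing here: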
var classicList = new ArrayList();
classicList.Add("string");
classicList.Add(23.3);
classicList.Add(34.4);
classicList.Add("string2");
var seq = classicList.OfType<int>();
foreach (var s in seq) Console.WriteLine(s);
Console.WriteLine("***************");
var seq1 = classicList.OfType<string>();
foreach (var s in seq1) Console.WriteLine(s);
Console.WriteLine(23.2 is int); // False
The workaround is to convert each element explicitly with Select:
int[] integers = { 1, 2, 3 };
var castLong = integers.Select(s => (long)s);
foreach (var s in castLong) Console.WriteLine(s);
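To see why Select is needed, the sketch below contrasts it with Cast<long> on the same int array. Cast<long> compiles, but fails during enumeration because a boxed int cannot be unboxed as a long. CastVersusSelectDemo and its method names are invented for this example.

```csharp
using System;
using System.Linq;

public static class CastVersusSelectDemo
{
    static readonly int[] Integers = { 1, 2, 3 };

    // Returns true if enumerating Cast<long>() over an int[] throws.
    public static bool CastThrows()
    {
        try
        {
            Integers.Cast<long>().ToList(); // ToList forces enumeration
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    // Select applies the C# numeric conversion to each element.
    public static long[] ConvertWithSelect() =>
        Integers.Select(i => (long)i).ToArray();

    public static void Main()
    {
        Console.WriteLine(CastThrows());                          // True
        Console.WriteLine(string.Join(",", ConvertWithSelect())); // 1,2,3
    }
}
```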
ToArray,ToList,ToDictionary,ToLookup
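ToArray, ToList, ToDictionary, and ToLookup force immediate enumeration of the input sequence and store the results in a new collection, so later changes to the source do not affect them.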
int[] integers = { 1, 2, 3 };
//var castLong = integers.Select(s => (long)s);
var castLong = integers.ToList();
integers[0] = 100;
foreach (var s in castLong) Console.WriteLine(s);
ToDictionary
:
int[] integers = { 1, 2, 3 };
//var castLong = integers.Select(s => (long)s);
var castLong = integers.ToDictionary(s=>s*2);
integers[0] = 100;
foreach (var s in castLong)
Console.WriteLine(s.Key.ToString() + ":" + s.Value.ToString());
int[] integers = { 1, 2, 3 };
//var castLong = integers.Select(s => (long)s);
var castLong = integers.ToDictionary(s => s % 2); // throws ArgumentException: 1 and 3 both produce the key 1
integers[0] = 100;
foreach (var s in castLong)
Console.WriteLine(s.Key.ToString() + ":" + s.Value.ToString());
ToLookup
int[] integers = { 1, 2, 3 };
//var castLong = integers.Select(s => (long)s);
var castLong = integers.ToLookup(s=>s%2);
integers[0] = 100;
foreach (var s in castLong)
{
Console.WriteLine("key:" + s.Key.ToString());
foreach (var p in s)
Console.Write(p + " ");
Console.WriteLine();
}
Element operators
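The main methods are First, FirstOrDefault, Last, LastOrDefault, Single, SingleOrDefault, ElementAt, ElementAtOrDefault, and DefaultIfEmpty.
Methods ending in OrDefault return the default value of the element type, rather than throwing an exception, when the input sequence is empty or no element matches.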
First,Last, Single
int[] numbers= { 1, 2, 3,4,5 };
int first = numbers.First();//1
int last = numbers.Last();//5
int firstEven = numbers.First(n => n % 2 == 0);//2
int lastEven = numbers.Last(n => n % 2 == 0);//4
//int firstBigError = numbers.First(n => n > 10);//Exception
int firstBigNumber = numbers.FirstOrDefault(n => n > 10);//0
int onlyDivBy3 = numbers.Single(n => n % 3 == 0);//3
//int divBy2Err = numbers.Single(n => n % 2 == 0);//Error
//int singleError = numbers.Single(n => n > 10);//Error
int noMatches = numbers.SingleOrDefault(n => n > 10);//0
//int divBy2Error = numbers.SingleOrDefault(n => n % 2 == 0);//Error
var total = new int[] {first,last,firstEven,lastEven,firstBigNumber,
onlyDivBy3,noMatches};
foreach (var t in total) Console.WriteLine(t);
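Single is the "fussiest" of these methods: it requires exactly one matching element and throws if there are none or more than one. SingleOrDefault relaxes this by returning the default value when nothing matches, but it still throws when more than one element matches.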
ElementAt
int[] numbers = { 1, 2, 3, 4, 5 };
int third = numbers.ElementAt(2);//3
//int ten=numbers.ElementAt(9);//Exception
int ten = numbers.ElementAtOrDefault(9);//0
var tot = new int[] { third, ten };
foreach (var t in tot) Console.WriteLine(t);
Aggregation methods
IEnumerable<TSource> to scalar
Count
int digitCount = "Pa55wo0rd1".Count(c => char.IsDigit(c));
Console.WriteLine(digitCount);
LongCount does the same job but returns a 64-bit integer, allowing for sequences of more than about 2 billion elements.
- Min, Max
int[] numbers = { 28, 32, 14 };
int max = numbers.Max(n => n % 10);//8
int min = numbers.Min(n => n % 10);//2
var m = new int[] { max, min };
foreach (var mm in m) Console.WriteLine(mm);
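- Sum, Average
Sum and Average are fairly strict about their types. For example, Average over an int sequence returns a double, so the following does not compile:
int avg = new int[] { 3, 4 }.Average(); // does not compile: cannot implicitly convert double to int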
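A short sketch of the result types that do compile, using the same input. AggregationTypesDemo and its method names are made up for this illustration.

```csharp
using System;
using System.Linq;

public static class AggregationTypesDemo
{
    // Average over int elements returns double.
    public static double AverageOfInts() => new int[] { 3, 4 }.Average();

    // Sum over int elements stays int.
    public static int SumOfInts() => new int[] { 3, 4 }.Sum();

    // Average over decimal elements returns decimal.
    public static decimal AverageOfDecimals() => new decimal[] { 3m, 4m }.Average();

    public static void Main()
    {
        Console.WriteLine(AverageOfInts());     // 3.5
        Console.WriteLine(SumOfInts());         // 7
        Console.WriteLine(AverageOfDecimals()); // 3.5
    }
}
```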
Quantifiers
bool hasTr = new int[] { 2, 3, 4 }.Contains(3);//true
bool hasAt = new int[] { 2, 3, 4 }.Any(n => n == 3);//true
bool hasABig = new int[] { 2, 3, 4 }.Any(n => n > 10);//false
SequenceEqual
SequenceEqual compares two sequences and returns true if they contain equal elements in the same order.
var a = new int[] { 1, 2, 3 };
var b = new int[] { 1, 2, 3 };
var c = new List<int> { 1, 2, 3 };
var d = new List<int> { 1, 2, 3 };
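Console.WriteLine(c.SequenceEqual(b)); // True: a list and an array with equal elements
Console.WriteLine(a.SequenceEqual(b)); // True
IStructuralEquatable e = (IStructuralEquatable)b; // arrays implement IStructuralEquatable
Console.WriteLine(e.Equals(a, EqualityComparer<int>.Default)); // True
Console.WriteLine(d.Equals(a)); // False: reference equality
// List<T> does not implement IStructuralEquatable, so the cast below would
// throw InvalidCastException at run time; use c.SequenceEqual(d) instead.
// IStructuralEquatable f = (IStructuralEquatable)c;
// Console.WriteLine(f.Equals(d, EqualityComparer<int>.Default));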
Generation Methods
Empty, Repeat, and Range are static methods of Enumerable.
- Empty
int[][] numbers =
{
new int[]{1,2,3},
new int[]{4,5,6},
null,
};
var flat = numbers.SelectMany(inner => inner);
foreach (var s in flat) Console.WriteLine(s);
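As you can see, the null inner array causes an exception (a NullReferenceException) once enumeration reaches it.
Using Empty together with the ?? operator solves the problem: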
int[][] numbers =
{
new int[]{1,2,3},
new int[]{4,5,6},
null,
};
var flat = numbers.SelectMany(inner => inner??Enumerable.Empty<int>());
foreach (var s in flat) Console.WriteLine(s);
Range and Repeat
foreach (int i in Enumerable.Range(5, 3)) Console.Write(i.ToString() + ",");
Console.WriteLine();
foreach (bool x in Enumerable.Repeat(true, 4)) Console.Write(x + ",");