Learn How to Avoid the 10 Most Common C# Mistakes

About C#

C# is one of several languages that target the Microsoft Common Language Runtime (CLR). Languages that target the CLR benefit from features such as cross-language integration and exception handling, enhanced security, a simplified model for component interaction, and debugging and profiling services.  Of today’s CLR languages, C# is the most widely used for complex, professional development projects that target the Windows desktop, mobile, or server environments.

C# is an object-oriented, strongly typed language. The strict type checking in C#, both at compile time and at run time, means that most typical programming errors are reported as early as possible, with their locations pinpointed quite accurately. This can save the C# programmer a lot of time compared to tracking down puzzling errors that surface long after the offending operation, as can happen in languages that enforce type safety more loosely. However, a lot of programmers unwittingly (or carelessly) throw away the benefits of this detection, which leads to some of the issues discussed in this C# tutorial.

About this Tutorial

This tutorial describes 10 of the most common programming mistakes made by C# programmers, or problems they should avoid, and offers help with each.

While most of the mistakes discussed in this article are C# specific, some are also relevant to other languages that target the CLR or make use of the Framework Class Library (FCL).

Common Mistake #1: Using a reference like a value or vice versa

Programmers of C++, and many other languages, are accustomed to being in control of whether the values they assign to variables are simply values or are references to existing objects. In C#, however, that decision is made by the programmer who wrote the object, not by the programmer who instantiates the object and assigns it to a variable.  This is a common “gotcha” for newbie C# programmers.

If you don’t know whether the object you’re using is a value type or reference type, you could run into some surprises. For example:

  Point point1 = new Point(20, 30);
  Point point2 = point1;
  point2.X = 50;
  Console.WriteLine(point1.X);       // 20 (does this surprise you?)
  Console.WriteLine(point2.X);       // 50
  
  Pen pen1 = new Pen(Color.Black);
  Pen pen2 = pen1;
  pen2.Color = Color.Blue;
  Console.WriteLine(pen1.Color);     // Blue (or does this surprise you?)
  Console.WriteLine(pen2.Color);     // Blue

As you can see, both the Point and Pen objects were created the exact same way, but the value of point1 remained unchanged when a new X coordinate value was assigned to point2, whereas the value of pen1 was modified when a new color was assigned to pen2. We can therefore deduce that point1 and point2 each contain their own copy of a Point object, whereas pen1 and pen2 contain references to the same Pen object. But how can we know that without doing this experiment?

The answer is to look at the definitions of the object types (which you can easily do in Visual Studio by placing your cursor over the name of the object type and pressing F12):

  public struct Point { … }     // defines a “value” type
  public class Pen { … }        // defines a “reference” type

As shown above, in C#, the struct keyword is used to define a value type, while the class keyword is used to define a reference type. For those with a C++ background, who were lulled into a false sense of security by the many similarities between C++ and C# keywords, this behavior likely comes as a surprise.

If you’re going to depend on some behavior which differs between value and reference types – such as the ability to pass an object as a method parameter and have that method change the state of the object – make sure that you’re dealing with the correct type of object to avoid C# problems.
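The difference matters most when an object is passed to a method. The following is a minimal sketch, assuming two hypothetical types (PointValue and PointRef, invented here for illustration and unrelated to System.Drawing), of how the same method body behaves for a struct, for a class, and for a struct passed with the ref keyword:

```csharp
using System;

// Hypothetical types for illustration only.
public struct PointValue { public int X; public int Y; }   // value type
public class PointRef { public int X; public int Y; }      // reference type

public static class ValueVsReferenceDemo
{
    // Receives a copy of the struct; the caller's variable is unaffected.
    public static void MoveValue(PointValue p) { p.X = 50; }

    // Receives a copy of the reference; caller and callee share one object.
    public static void MoveRef(PointRef p) { p.X = 50; }

    // 'ref' passes the caller's variable itself, so the struct changes in place.
    public static void MoveValueByRef(ref PointValue p) { p.X = 50; }

    public static void Main()
    {
        var v = new PointValue { X = 20, Y = 30 };
        var r = new PointRef { X = 20, Y = 30 };

        MoveValue(v);
        Console.WriteLine(v.X);   // 20

        MoveRef(r);
        Console.WriteLine(r.X);   // 50

        MoveValueByRef(ref v);
        Console.WriteLine(v.X);   // 50
    }
}
```

If a method is expected to change the caller's state, checking whether the parameter type is a struct or a class (or adding ref deliberately) avoids the silent no-op shown in the first call.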

Common Mistake #2: Misunderstanding default values for uninitialized variables

In C#, value types can’t be null. By definition, value types have a value, and even uninitialized variables of value types must have a value. This is called the default value for that type.  This leads to the following, usually unexpected result when checking if a variable is uninitialized:

  class Program {
      static Point point1;
      static Pen pen1;
      static void Main(string[] args) {
          Console.WriteLine(pen1 == null);      // True
          Console.WriteLine(point1 == null);    // False (huh?)
      }
  }

Why isn’t point1 null? The answer is that Point is a value type, and the default value for a Point is (0,0), not null. Failure to recognize this is a very easy (and common) mistake to make in C#.

Many (but not all) value types have an IsEmpty property which you can check to see if it is equal to its default value:

  Console.WriteLine(point1.IsEmpty);        // True

When you’re checking to see if a variable has been initialized or not, make sure you know what value an uninitialized variable of that type will have by default and don’t rely on it being null.
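For value types that have no IsEmpty property, two general-purpose options are comparing against default(T) or declaring the variable as a nullable value type. The sketch below assumes a hypothetical struct named Size2D; it is an illustration, not a type from the framework:

```csharp
using System;

// Hypothetical value type for illustration only.
public struct Size2D { public int Width; public int Height; }

public static class DefaultValueDemo
{
    static Size2D size;        // value type: default has all fields zeroed
    static Size2D? maybeSize;  // Nullable<Size2D>: default is null
    static string name;        // reference type: default is null

    // True when the struct still holds its default (all-zero) value.
    public static bool IsDefault(Size2D s) => s.Equals(default(Size2D));

    public static void Main()
    {
        Console.WriteLine(IsDefault(size));       // True
        Console.WriteLine(maybeSize.HasValue);    // False
        Console.WriteLine(name == null);          // True

        maybeSize = new Size2D { Width = 1 };
        Console.WriteLine(maybeSize.HasValue);    // True
    }
}
```

A nullable value type is the closer fit when "not set yet" must be distinguishable from a legitimate zero value.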

Common Mistake #3: Using improper or unspecified string comparison methods

There are many different ways to compare strings in C#.

Although many programmers use the == operator for string comparison, it is actually one of the least desirable methods to employ, primarily because it doesn’t specify explicitly in the code which type of comparison is wanted.

Rather, the preferred way to test for string equality in C# is with the Equals method:

  public bool Equals(string value);
  public bool Equals(string value, StringComparison comparisonType);

The first method signature (i.e., without the comparisonType parameter) is actually the same as using the == operator, but has the benefit of being explicitly applied to strings. It performs an ordinal comparison of the strings, which compares the numeric value of each character (each UTF-16 code unit) in turn, with no cultural rules applied. In many cases this is exactly the type of comparison you want, especially when comparing strings whose values are set programmatically, such as file names, environment variables, attributes, etc. In these cases, as long as an ordinal comparison is indeed the correct type of comparison for that situation, the only downside to using the Equals method without a comparisonType is that somebody reading the code may not know what type of comparison you’re making.

Using the Equals method signature that includes a comparisonType every time you compare strings, though, will not only make your code clearer, it will make you explicitly think about which type of comparison you need to make. This is a worthwhile thing to do, because even if English may not provide a whole lot of differences between ordinal and culture-sensitive comparisons, other languages provide plenty, and ignoring the possibility of other languages is opening yourself up to a lot of potential for errors down the road.  For example:

  string s = "strasse";
  
  // outputs False:
  Console.WriteLine(s == "straße");
  Console.WriteLine(s.Equals("straße"));
  Console.WriteLine(s.Equals("straße", StringComparison.Ordinal));
  Console.WriteLine(s.Equals("Straße", StringComparison.CurrentCulture));        
  Console.WriteLine(s.Equals("straße", StringComparison.OrdinalIgnoreCase));
  
  // outputs True:
  Console.WriteLine(s.Equals("straße", StringComparison.CurrentCulture));
  Console.WriteLine(s.Equals("Straße", StringComparison.CurrentCultureIgnoreCase));

The safest practice is to always provide a comparisonType parameter to the Equals method. Here are some basic guidelines:

  • When comparing strings that were input by the user, or are to be displayed to the user, use a culture-sensitive comparison (CurrentCulture or CurrentCultureIgnoreCase).
  • When comparing programmatic strings, use ordinal comparison (Ordinal or OrdinalIgnoreCase).
  • InvariantCulture and InvariantCultureIgnoreCase are generally not to be used except in very limited circumstances, because ordinal comparisons are more efficient.  If a culture-aware comparison is necessary, it should usually be performed against the current culture or another specific culture.

In addition to the Equals method, strings also provide the Compare method, which gives you information about the relative order of strings instead of just a test for equality. C# does not define the <, <=, > and >= operators on strings, so Compare is the way to test ordering, and its overloads that take a StringComparison let you state the type of comparison explicitly, for the same reasons as discussed above.
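As a rough sketch of these guidelines, the example below uses ordinal comparisons for programmatic strings and a culture-sensitive comparison for user-facing text. The class and method names (StringComparisonDemo, IsConfigFile, SameDisplayName, ComesBefore) are hypothetical, chosen only for illustration:

```csharp
using System;

public static class StringComparisonDemo
{
    // Programmatic string (a file extension): ordinal, ignoring case.
    public static bool IsConfigFile(string fileName) =>
        fileName.EndsWith(".config", StringComparison.OrdinalIgnoreCase);

    // User-facing text: compared under the rules of the current culture,
    // so the result can vary with the culture the program runs under.
    public static bool SameDisplayName(string a, string b) =>
        string.Equals(a, b, StringComparison.CurrentCultureIgnoreCase);

    // Relative order with an explicit comparison type.
    public static bool ComesBefore(string a, string b) =>
        string.Compare(a, b, StringComparison.Ordinal) < 0;

    public static void Main()
    {
        Console.WriteLine(IsConfigFile("App.CONFIG"));                        // True
        Console.WriteLine("Path".Equals("PATH", StringComparison.Ordinal));   // False
        Console.WriteLine(ComesBefore("apple", "banana"));                    // True
        Console.WriteLine(SameDisplayName("Alice", "alice"));
    }
}
```

Every call names its comparison type, so a reader never has to guess whether case or culture affects the result.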

Common Mistake #4: Using iterative (instead of declarative) statements to manipulate collections

In C# 3.0, the addition of Language-Integrated Query (LINQ) to the language changed forever the way collections are queried and manipulated. Since then, if you’re using iterative statements to manipulate collections, you are probably passing up an opportunity to use LINQ.

Some C# programmers don’t even know of LINQ’s existence, but fortunately that number is becoming increasingly small. Many still think, though, that because of the similarity between LINQ keywords and SQL statements, its only use is in code that queries databases.

While database querying is a very prevalent use of LINQ statements, they actually work over any enumerable collection (i.e., any object that implements the IEnumerable interface).  So for example, if you had an array of Accounts, instead of writing:

  decimal total = 0;
  foreach (Account account in myAccounts) {
    if (account.Status == "active") {
      total += account.Balance;
    }
  }

you could just write:

  decimal total = (from account in myAccounts
                   where account.Status == "active"
                   select account.Balance).Sum();

While this is a pretty simple example of how to avoid this common C# programming problem, there are cases where a single LINQ statement can easily replace dozens of statements in an iterative loop (or nested loops) in your code. And less code generally means fewer opportunities for bugs to be introduced. Keep in mind, however, there may be a trade-off in terms of performance. In performance-critical scenarios, especially where your iterative code is able to make assumptions about your collection that LINQ cannot, be sure to do a performance comparison between the two methods.
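The query above can also be written in LINQ's method syntax. The following self-contained sketch assumes a minimal Account class with a string Status and a decimal Balance (the article does not show its definition) and checks that the loop and the LINQ version agree:

```csharp
using System;
using System.Linq;

// Assumed shape of Account; the original article does not define it.
public class Account
{
    public string Status { get; set; }
    public decimal Balance { get; set; }
}

public static class LinqSumDemo
{
    // Iterative version, as in the first snippet above.
    public static decimal SumLoop(Account[] accounts)
    {
        decimal total = 0;
        foreach (Account account in accounts)
        {
            if (account.Status == "active")
            {
                total += account.Balance;
            }
        }
        return total;
    }

    // Declarative version using LINQ method syntax.
    public static decimal SumLinq(Account[] accounts) =>
        accounts.Where(a => a.Status == "active").Sum(a => a.Balance);

    public static void Main()
    {
        var accounts = new[]
        {
            new Account { Status = "active", Balance = 100m },
            new Account { Status = "closed", Balance = 25m },
            new Account { Status = "active", Balance = 50m },
        };

        Console.WriteLine(SumLoop(accounts));   // 150
        Console.WriteLine(SumLinq(accounts));   // 150
    }
}
```

Keeping both versions side by side like this is also a convenient starting point for the performance comparison mentioned above.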

Common Mistake #5: Failing to consider the underlying objects in a LINQ statement

LINQ is great for abstracting the task of manipulating collections, whether they are in-memory objects, database tables, or XML documents.  In a perfect world, you wouldn’t need to know what the underlying objects are. But the error here is assuming we live in a perfect world.  In fact, identical LINQ statements can return different results when executed on the exact same data, if that data happens to be in a different format.

For instance, consider the following statement:

  decimal total = (from account in myAccounts
                   where account.Status == "active"
                   select account.Balance).Sum();

What happens if the Status of one of the accounts equals “Active” (note the capital A)?  Well, if myAccounts was a DbSet object (that was set up with the default case-insensitive configuration), the where expression would still match that element.  However, if myAccounts was an in-memory array, it would not match, and would therefore yield a different result for total.

But wait a minute.  When we talked about string comparison earlier, we saw that the == operator performed an ordinal comparison of strings. So why in this case is the == operator performing a case-insensitive comparison?

The answer is that when the underlying objects in a LINQ statement are references to SQL table data (as is the case with the Entity Framework DbSet object in this example), the statement is converted into a T-SQL statement. Operators then follow T-SQL rules, not C# rules, so the comparison in the above case ends up being case insensitive.

In general, even though LINQ is a helpful and consistent way to query collections of objects, in reality you still need to know whether or not your statement will be translated to something other than C# under the hood to ensure that the behavior of your code will be as expected at runtime.
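For the in-memory case only, the difference can be made visible, and the intent made explicit, as in the sketch below. It again assumes a minimal Account class, and it deliberately says nothing about database-backed queries: whether a given LINQ provider can translate an overload that takes a StringComparison depends on the provider and is not assumed here.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Assumed shape of Account; the original article does not define it.
public class Account
{
    public string Status { get; set; }
    public decimal Balance { get; set; }
}

public static class CaseSensitivityDemo
{
    // In memory, == on strings is an ordinal, case-sensitive comparison.
    public static decimal SumActiveCaseSensitive(IEnumerable<Account> accounts) =>
        accounts.Where(a => a.Status == "active").Sum(a => a.Balance);

    // Stating the comparison type removes the ambiguity for in-memory data.
    public static decimal SumActiveIgnoreCase(IEnumerable<Account> accounts) =>
        accounts.Where(a => string.Equals(a.Status, "active",
                                          StringComparison.OrdinalIgnoreCase))
                .Sum(a => a.Balance);

    public static void Main()
    {
        var accounts = new List<Account>
        {
            new Account { Status = "active", Balance = 100m },
            new Account { Status = "Active", Balance = 50m },
            new Account { Status = "closed", Balance = 25m },
        };

        Console.WriteLine(SumActiveCaseSensitive(accounts));   // 100
        Console.WriteLine(SumActiveIgnoreCase(accounts));      // 150
    }
}
```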

Common Mistake #6: Getting confused or faked out by extension methods

As mentioned earlier, LINQ statements work on any object that implements IEnumerable. For example, the following simple function will add up the balances on any collection of accounts:

  public decimal SumAccounts(IEnumerable<Account> myAccounts) {
      return myAccounts.Sum(a => a.Balance);
  }

In the above code, the type of the myAccounts parameter is declared as IEnumerable<Account>.  Since myAccounts references a Sum method (C# uses the familiar “dot notation” to reference a method on a class or interface), we’d expect to see a method called Sum() on the definition of the IEnumerable<T> interface.  However, the definition of IEnumerable<T> makes no reference to any Sum method and simply looks like this:

  public interface IEnumerable<out T> : IEnumerable {
      IEnumerator<T> GetEnumerator();
  }

So where is the Sum() method defined? C# is strongly typed, so if the reference to the Sum method was invalid, the C# compiler would certainly flag it as an error.  We therefore know that it must exist, but where?  Moreover, where are the definitions of all the other methods that LINQ provides for querying or aggregating these collections?

The answer is that Sum() is not a method defined on the IEnumerable interface. Rather, it is a static method (called an “extension method”) that is defined on the System.Linq.Enumerable class:

  namespace System.Linq {
    public static class Enumerable {
      ...
      // the reference here to “this IEnumerable<TSource> source” is
      // the magic sauce that provides access to the extension method Sum
      public static decimal Sum<TSource>(this IEnumerable<TSource> source,
                                         Func<TSource, decimal> selector);
      ...
    }
  }

So what makes an extension method different from any other static method and what enables us to access it in other classes?

The distinguishing characteristic of an extension method is the this modifier on its first parameter. This is the “magic” that identifies it to the compiler as an extension method. The type of the parameter it modifies (in this case IEnumerable<TSource>) denotes the class or interface which will then appear to implement this method.

(As a side point, there’s nothing magical about the similarity between the name of the IEnumerable interface and the name of the Enumerable class on which the extension method is defined. This similarity is just an arbitrary stylistic choice.)

With this understanding, we can also see that the SumAccounts function we introduced above could instead have been implemented as follows:

  public decimal SumAccounts(IEnumerable<Account> myAccounts) {
      return Enumerable.Sum(myAccounts, a => a.Balance);
  }

The fact that we could have implemented it this way instead raises the question of why have extension methods at all?  Extension methods are essentially a convenience of the C# language that enables you to “add” methods to existing types without creating a new derived type, recompiling, or otherwise modifying the original type.

Extension methods are brought into scope by including a using [namespace]; statement at the top of the file. You need to know which namespace includes the extension methods you’re looking for, but that’s pretty easy to determine once you know what it is you’re searching for.

When the C# compiler encounters a method call on an instance of an object, and doesn’t find that method defined on the referenced object class, it then looks at all extension methods that are within scope to try to find one which matches the required method signature and class. If it finds one, it will pass the instance reference as the first argument to that extension method, then the rest of the arguments, if any, will be passed as subsequent arguments to the extension method.  (If the C# compiler doesn’t find any corresponding extension method within scope, it will report a compile-time error.)

Extension methods are an example of “syntactic sugar” on the part of the C# compiler, which allows us to write code that is (usually) clearer and more maintainable.  Clearer, that is, if you’re aware of their usage. Otherwise, it can be a bit confusing, especially at first.

While there certainly are advantages to using extension methods, they can cause problems for developers who aren’t aware of them or don’t properly understand them. This is especially true when looking at code samples online, or at any other pre-written code. When such code produces compiler errors (because it invokes methods that clearly aren’t defined on the classes they’re invoked on), the tendency is to think the code applies to a different version of the library, or to a different library altogether. A lot of time can be spent searching for a new version, or phantom “missing library”, that doesn’t exist.

Even developers who are familiar with extension methods still get caught occasionally, when there is a method with the same name on the object, but its method signature differs in a subtle way from that of the extension method. A lot of time can be wasted looking for a typo or error that just isn’t there.
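One way this can happen is sketched below, using hypothetical Report and ReportExtensions types invented for illustration. Because the compiler only considers extension methods when no instance method is applicable, an instance method with a looser signature quietly wins over an extension method that looks like a better match:

```csharp
using System;

// Hypothetical class with an instance method that accepts any object.
public class Report
{
    public string Describe(object value) => "instance";
}

// Hypothetical extension with a more specific parameter type.
public static class ReportExtensions
{
    public static string Describe(this Report report, int value) => "extension";
}

public static class PrecedenceDemo
{
    public static void Main()
    {
        var report = new Report();

        // The instance method is applicable (42 is boxed to object),
        // so the extension method is never considered.
        Console.WriteLine(report.Describe(42));                     // instance

        // Calling the static method directly still reaches the extension.
        Console.WriteLine(ReportExtensions.Describe(report, 42));   // extension
    }
}
```

No warning is issued for the first call, which is why this kind of mismatch can take a while to spot.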

Use of extension methods in C# libraries is becoming increasingly prevalent.  In addition to LINQ, the Unity Application Block and the Web API framework are examples of two heavily-used modern libraries by Microsoft which make use of extension methods as well, and there are many others. The more modern the framework, the more likely it is that it will incorporate extension methods.

Of course, you can write your own extension methods as well. Realize, however, that while extension methods appear to get invoked just like regular instance methods, this is really just an illusion.  In particular, your extension methods can’t reference private or protected members of the class they’re extending and therefore cannot serve as a complete replacement for more traditional class inheritance.
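A minimal sketch of a user-written extension method follows; the names EnumerableExtensions and IsNullOrEmpty are hypothetical. It also shows one consequence of the "illusion": since the call is really a static method call, it works even when the "instance" is null, where a true instance method would throw a NullReferenceException.

```csharp
using System;
using System.Collections.Generic;

public static class EnumerableExtensions
{
    // The 'this' modifier on the first parameter makes this an extension
    // method on every IEnumerable<T>. It can only use public members.
    public static bool IsNullOrEmpty<T>(this IEnumerable<T> source)
    {
        if (source == null) return true;
        using (IEnumerator<T> e = source.GetEnumerator())
        {
            return !e.MoveNext();
        }
    }
}

public static class ExtensionDemo
{
    public static void Main()
    {
        List<int> none = null;
        var empty = new List<int>();
        var some = new List<int> { 1, 2, 3 };

        // Looks like an instance call, but compiles to
        // EnumerableExtensions.IsNullOrEmpty(none), so null is fine here.
        Console.WriteLine(none.IsNullOrEmpty());    // True
        Console.WriteLine(empty.IsNullOrEmpty());   // True
        Console.WriteLine(some.IsNullOrEmpty());    // False
    }
}
```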

Common Mistake #7: Using the wrong type of collection for the task at hand

C# provides a large variety of collection objects, with the following being only a partial list:

  • Array
  • ArrayList
  • BitArray
  • BitVector32
  • Dictionary<K,V>
  • Hashtable
  • HybridDictionary
  • List<T>
  • NameValueCollection
  • OrderedDictionary
  • Queue, Queue<T>
  • SortedList
  • Stack, Stack<T>
  • StringCollection
  • StringDictionary

While there can be cases where too many