[user]
name = Oleg Andreev
email = oleganza@gmail.com
[apply]
whitespace = strip
[diff]
color = auto
rename = copy
[pager]
color = true
# this one is very cool:
# green means "to be committed"
# red means "not to be committed"
[status]
color = auto
Must read.
Today I received a letter:
Hello Oleg,
I'm an Io newbie. I was watching some of your sample code on Github (loved funnyFactorial ;-) when I discovered your "learning french" subdir. I'm French and would be pleased to answer / comment / whatever about that language (not so human).
Github offers you an HTTP server for static data at http://yourname.github.com. Publishing is easy: just push content to git@yourname.github.com. (It has been around for two months already; how could I have missed it!)
$ git rev-list -n 1 HEAD <path/to/folder>
Returns the latest commit that modified the given path. This is useful for finding out whether something has recently changed in a particular folder.
A young hacker looks at the estimates: "2 hours for feature Foo, 4 hours for feature Bar". He feels that kind of pressure: "I have to make it! I have to type faster, think faster, test faster."
This is an awful feeling. So here's a possible way out: try to think of time as money you are investing. Tell yourself how much time of your life you would invest into this piece of #$@^ (taking your rate/salary into account, of course). Now it looks like you've priced feature Foo at just 2 hours: it isn't worth 4 hours or more. Spend 10-15 minutes planning how to use that much time and do your best. If trouble strikes and you run out of time, just give up. Move on to another feature and let this one be discussed at the weekly meeting, when there's time to schedule the next iteration.
If the client wants a fixed price for the software, you will not have any additional time. In that case, either do a dirty job or work all night. You decide.
And when they drop back to French, the discussion becomes complete nonsense.
At this very moment I'm attending a meeting at The Big Company in France. There are six French folks around me speaking English instead of French. The only reason is me: I don't speak French. It's a bit hard for everyone to speak and understand English, and initially I was a little ashamed of that. But soon I realized that the difficulty of speaking English forces everyone to focus on the essentials and keeps us from wasting time on nonsense. Sweet.
See also the slides (in the form of Factor plain-text source code).
Apparently, there’s no conceptual problem with cloning a subdirectory like:
$ git clone git@server:project.git/docs/howto
You just keep track of the tree objects referencing "/", "/docs" and "/docs/howto", and fetch no objects except the children of the "/docs/howto" tree.
There’s a problem in Ruby: what if your application requires two libraries and both of them require incompatible versions of another library?
API designers who are smart enough create a namespace for each major version (MyLib::API1, MyLib::API2 etc.) so that you can have multiple versions of the same code in run time.
There's a better solution, however. The Io language does not force you to modify global state: source code can be loaded into any object. This means you don't have to pollute library code with version-based namespaces, yet you are still able to load as many instances of the library as you want. Just make sure you keep them in your private objects.
Dreams come true:
MyParser := Package load("XMLParser", "e1fc39a02d786")
You definitely should try these.
Exceptions are meant to be… ehm… exceptional. Exceptions are thrown when the code cannot be executed any further. They are meant to be thrown right at the point where something went wrong and passed up the stack until program termination. The programmer should only provide an "ensure" block (in Ruby; "finally" in Java) to clean up. The programmer should never use a "catch"/"rescue" block. Never.
There’s one little thing, however.
Sometimes you run your program and get silly exceptions like "connection lost" or "division by zero". You become unhappy about it and decide to implement an interface for dealing with such errors. For example, when the connection is lost you could show a message or do something smart (depending on the purpose of your program, of course).
But please remember: never ever catch exceptions you don't know about (no "rescue nil" or "rescue => e"!). You should be very picky about what you are catching. An uncaught exception simply pops up as a system message or a log entry, so you can learn from it. But a silently caught exception might hide some nasty error from your eyes, and you wouldn't be able to see in the stack trace what happened a few milliseconds earlier.
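To make the policy concrete, here is a minimal Ruby sketch (the error class and methods are invented for illustration): cleanup goes into ensure, and rescue names the one exception class we actually have an interface for.

```ruby
# Invented for illustration: the one failure we know how to present.
class ConnectionLostError < StandardError; end

def fetch_data
  raise ConnectionLostError, "connection lost"
ensure
  # cleanup (close sockets, release locks) runs in any case
end

def show_page
  fetch_data
rescue ConnectionLostError => e   # a specific class, never a bare rescue
  "Sorry: #{e.message}"
end

show_page   # => "Sorry: connection lost"
```

Any other exception raised inside fetch_data would still pop up the stack untouched, exactly as it should.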
Both Ben and Yehuda are wrong.
They both use messy metaprogramming where Ruby already has a nice solution: a chain of modules and the super method.
If the base functionality is provided in a module it looks like this:
module BaseFeatures
  def hello
    "Hello, world!"
  end
end

module AnyGreetingPlugin
  def hello(arg = "world")
    super().sub(/world/, arg)
  end
end

class MyConfiguration
  include BaseFeatures
  include AnyGreetingPlugin
  include AnotherPlugin
end
If your base functionality is packed in a class rather than in a module, no problem; the solution is pretty much the same:
class MyConfiguration < BaseFeatures
  include AnyGreetingPlugin
  include AnotherPlugin
end
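For completeness, here is the chain as one runnable script. AnotherPlugin is left undefined in the snippets above, so this sketch invents one (it merely appends an exclamation mark); note the explicit super() in AnyGreetingPlugin, which calls the next hello in the chain with no arguments:

```ruby
module BaseFeatures
  def hello
    "Hello, world!"
  end
end

module AnyGreetingPlugin
  def hello(arg = "world")
    # super() with parentheses calls the next #hello with zero arguments,
    # because BaseFeatures#hello accepts none
    super().sub(/world/, arg)
  end
end

# Made-up plugin, just to show a third link in the chain.
module AnotherPlugin
  def hello(*args)
    super + "!"
  end
end

class MyConfiguration
  include BaseFeatures
  include AnyGreetingPlugin
  include AnotherPlugin   # the last included module is called first
end

MyConfiguration.new.hello("Ruby")   # => "Hello, Ruby!!"
```

The dispatch order is simply MyConfiguration.ancestors: AnotherPlugin, then AnyGreetingPlugin, then BaseFeatures.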
Now let me respond to each point from Ben's article:
1. "The method names don't properly describe what is going on". The module name describes the particular functionality that its methods add.
2. “The base method shouldn’t care what other modules are doing, modularity is broken”. That’s not the case when you use regular modules.
3. “Logic is not 100% contained”. Logic is 100% contained: no magical macros anywhere.
4. “Promotes messy code”. Again, nothing to even seem messy.
5. "Exposes a documentation flaw". When you think in modules, it is easy to separate concerns and understand how every little aspect of functionality affects everything else. You don't have to speak any language other than Ruby. You think in module chains and message passing: no structure is created dynamically where it isn't necessary. The only thing you have to do is describe in the documentation what each class or module is supposed to do. Then provide examples of default configurations (where some modules are included in a single class) to complete the picture. Respect the language. Keep it simple, stupid.
class User < ActiveRecord::Base; end
class Artist < User; end
class Investor < User; end
"I don't understand why this would be a very bad idea? All the users are stored in the same table as they have a lot of attributes and not much differ…"
This starts with the naming of a user. As I wrote to you recently, the name "User" completely hides "role" from you, so it seems natural to put all the roles into the User model. However, huge models tend to become harder and harder to modify and understand.
If you think of it that way:
- Person - holds authentication info
- Artist - holds info about music and albums
- Investor - holds info about artists and finance
the following becomes easy to play with:
- Person has many Artists (say, I can create several accounts for a number of my bands)
- Artist info can be edited by a group of People (my band members would like to update the news page/wiki/whatever)
- I (as a person) can represent several investors, or none at all.
- Investor can manage a number of artists, and/or a single artist can have several investors.
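The relationships listed above can be sketched in plain Ruby (Struct stands in for what would be ActiveRecord models with has_many/belongs_to; the instance names are invented):

```ruby
# Struct stands in for ActiveRecord models in this sketch.
Person   = Struct.new(:login, :artists)   # holds authentication info
Artist   = Struct.new(:name, :editors)    # holds info about music and albums
Investor = Struct.new(:name, :artists)    # holds info about artists and finance

me   = Person.new("oleg", [])
band = Artist.new("My Band", [])

me.artists << band    # a person can create several artist accounts
band.editors << me    # artist info can be edited by a group of people

investor = Investor.new("Label", [band])  # an investor manages artists

me.artists.map(&:name)        # => ["My Band"]
investor.artists.first.name   # => "My Band"
```

None of these relationships fit comfortably into a single User table, which is the whole point of splitting the roles out.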
The reason to separate models by tasks is the very same as to separate objects from the top global Object class into more specific ones.
Speaking scientifically, it is just about “normalization” of a relational database.
If you have duplicate attributes, you have three equally good options (depending on your situation):
1. Mix them in using a module (e.g. "EntityWithTitleAndDescription") if the duplication is just a coincidence and not a big deal (just to gather the duplication into a single file and keep things cleaner).
2. Implement a separate model and associate it with the appropriate models (e.g. "Account" could mediate between Person and Project to manage wikis/pages/documents/artists/etc. and avoid hardcore polymorphism between Person and Project). This is the case in a Basecamp-like app, where people have individual data as well as data shared by a group (project).
3. Leave the duplication as is: the Coincidence pattern.
Sometimes you have to have STI, but I believe this is not such a case. E.g. I have PublicFolder and Inbox subclasses of the class Folder because they are a little special per se, not by their association with other folders.
A process consists of a number of phases. Each phase provides feedback on its performance.
Instead of defining some performance threshold at which each phase starts being optimized and asking ourselves "when should we start optimizing this?", we should rather ask "which phase should be optimized now?". That is, we collect all the feedback, sort it, and start optimizing the most important phases first. Naturally, we end the process when we no longer get any visible performance gains.
This strategy can be applied to dynamic programming language runtime as well as to any other controllable process.
At each callsite (the source code point where method dispatch happens) we can count:
1) number of calls to a method
2) time spent inside the method
3) time spent in callsite (total time across all methods called at the point).
Time can be measured in byte-code instructions, machine instructions or microseconds.
Let's look at the possible situations:
In real code we rarely meet frequently called, very slow methods: they usually result from bad design and cannot be efficiently optimized at runtime. But this chart helps us define a metric for "hot spot" identification: the place in the code where we start runtime optimization.
Such a "hotspot" metric would be callsite time * number of calls. The higher this number, the higher the priority given to that callsite in the optimization phase.
Why don't we just start with a top-level method? Because if we started from the very top, we would spend an enormous amount of time optimizing the whole program instead of the really interesting parts.
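As a sketch, the metric boils down to sorting callsites by calls times total time (all locations and numbers here are invented for illustration):

```ruby
# Each callsite records how many times it fired and the total time
# (in microseconds) spent in the methods dispatched from it.
Callsite = Struct.new(:location, :calls, :total_time_us) do
  def hotspot_score
    calls * total_time_us   # higher score = optimize first
  end
end

sites = [
  Callsite.new("app.rb:10", 1_000_000, 50_000),  # hot loop, fast method
  Callsite.new("app.rb:42", 3, 2_000_000),       # rare but very slow
  Callsite.new("main.rb:1", 1, 9_000_000),       # the top-level entry point
]

# Sort hottest first; note the top-level call is NOT automatically on top.
sites.sort_by { |s| -s.hotspot_score }.map(&:location)
# => ["app.rb:10", "main.rb:1", "app.rb:42"]
```

The hot inner loop wins even though each individual call is cheap, which is exactly the behavior we want from the metric.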
Now I can easily bring my favorite aliases and commands to different environments: MacBooks, Linux and FreeBSD servers.
Ruby symbols should just be immutable strings, to avoid all that mess with string/symbol keys in option hashes. Just treat every literal string as an immutable one. You could even keep the nice colon syntax to save some keystrokes.
So, basically:
String.should < ImmutableString
"symbol".should == :symbol
The parser still fills a symbol table when it encounters literal strings in the source code, so the unique number is just the string's object_id.
When you compare two strings, you compare ids first, and only if they differ do you proceed with byte comparison. No need to convert with :symbol.to_s.
How do you make a string mutable? Just #dup or #clone it.
"symbol".dup.class.should == String
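Today's Ruby lets you approximate the proposed semantics with frozen strings. This is only a sketch of the idea, not the parser-level change proposed above:

```ruby
s = "symbol".freeze   # stands in for the proposed immutable literal
s.frozen?             # => true

begin
  s << "!"            # mutation is rejected
rescue RuntimeError   # FrozenError on modern Rubies, a RuntimeError subclass
  # ignore: we only wanted to show the mutation fails
end

# "How do you make a string mutable? Just #dup or #clone it."
m = s.dup             # dup returns an unfrozen copy
m << "!"
m   # => "symbol!"
s   # => "symbol" (the original is untouched)
```

With the magic comment `# frozen_string_literal: true`, every literal string in a file behaves this way, which is very close to what this post asks for.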
1. “Classic”: explicit require and manual dependencies resolution.
Pros: no bugs.
Cons: lazy programmers want to write and maintain less code.
2. “Rails method” (dependencies.rb): catch const_missing in runtime, guess the file name, load it.
Pros: zero-configuration, good automatic dependencies resolution.
Cons: file name guessing is a constraint, hard to debug runtime threading issues.
3. “Merb method” (load everything in a loop, reload failed files):
Pros: zero-configuration, moderate automatic dependencies resolution, no threading issues, issues are explicit and may arise in a bootstrap process only.
Cons: twice loaded constants yield warning, hidden repeated configuration bugs may appear.
4. "Io language method" (load everything sorted by filename):
Pros: no bugs, zero-configuration, easy to understand, optional manual dependencies resolution.
Cons: 2-5% of files are named with ugly prefixes like A0_message.io (higher priority) or Z_exceptions.io (lower priority).
Note: no method saves you from manual resolution of the two-way referencing issue (when A references B and B references A). You still have to create a special “header” file to put bootstrap definitions into.
Conclusion: the Merb method is essentially the Io method with an additional "feature" (buggy by nature) of double-loading files in case of a LoadError. If you name your files according to the Io method and resolve all interdependencies manually, Merb will load them once and in the correct order.
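The Io-style loader fits in a few lines of Ruby (the directory layout is an assumption; priority is encoded in file names, A0_... first, Z_... last):

```ruby
# Load every Ruby file under a directory exactly once, sorted by path.
# Priority prefixes in file names (A0_..., Z_...) control the order,
# just like in Io.
def load_sorted(dir)
  Dir.glob(File.join(dir, "**", "*.rb")).sort.each do |file|
    require File.expand_path(file)   # require, not load: each file once
  end
end
```

Because require caches files by expanded path, re-running the loader never double-loads anything, avoiding the "twice loaded constants" warnings of the Merb method.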
We have a small team working with SVN. I like to write code under Git. Everyone is okay with command line except Sylvaine, our designer. She uses nice update/commit menu item in the Finder for her SVN-tracked PSD files.
So, here are all the facts:
0. We are working with Mac OS X.
1. I want all revisions stored on my machine.
2. I want good branch support for the source code.
3. I don’t need sophisticated branches for PSD-like documents.
4. Sylvaine probably does not want to open a terminal to type "ga maquette.psd && gc 'new block' && gush". She wants to do this from the Finder with a few clicks.
5. We have awful Mac GUIs for both Git and Mercurial (none for Git and a half-assed MacMercurial).
6. We have git-svn which works fine.
Result: keep individual documents under SVN, enjoy GUI and use git-svn in the Terminal. For source code use Git.
Everybody might be happy.
We already know CSV and TSV: the comma-separated and tab-separated table formats. CSV looks ugly in human-readable files; TSV suffers from invisible whitespace and editor-dependent formatting.
I suggest another format for human-readable files: a list of records with fields separated by two or more spaces.
This lets you format the file with spaces only and avoid bloating it with quotes (and quote escape sequences). The text looks nice and light without unnecessary commas, quotes, backslashes and invisible tabs.
The only rule is:
1. Lines are separated by [\r\n]+
2. Fields are separated by \s\s+
The parser is obvious.
Actually, this format is already used in a variety of config files on *nix systems. The most noticeable example is /etc/fstab.
PS. Avoid the tab character wherever possible!
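For reference, the "obvious parser" in Ruby, applying exactly the two rules above (the sample data is invented, fstab-style):

```ruby
# Parse the two-or-more-spaces format: lines split on [\r\n]+,
# fields split on runs of two or more whitespace characters.
def parse_records(text)
  text.split(/[\r\n]+/)
      .map(&:strip)
      .reject(&:empty?)
      .map { |line| line.split(/\s{2,}/) }
end

fstab_like = <<~TEXT
  /dev/disk0s2   /               hfs   rw
  /dev/disk0s3   /Volumes/Data   hfs   rw,nosuid
TEXT

parse_records(fstab_like)
# => [["/dev/disk0s2", "/", "hfs", "rw"],
#     ["/dev/disk0s3", "/Volumes/Data", "hfs", "rw,nosuid"]]
```

Single spaces inside a field survive, so values like "New York" need no quoting; only runs of two or more spaces act as separators.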
Some web 2.0 guys might not understand quite well what each part of the "ehm-veh-ceh" pattern should and should not do. They should forget the "fat model, thin controller" misconception. Today we'll try to help them. (Code should be light in either case.)
First of all, a simple example about “why it matters”:
A music player contains two volume controls and a sound stream. One control is a slider, the other is a "mute" button. The sound stream has an API with "volume", "bass" and "track number" properties. If you try to modify the sound stream directly within the button and slider event callbacks, you will face a UI consistency problem: when the mute button is clicked, the slider should be disabled or slide down. The ad hoc attempt is to bloat your callbacks with a bunch of if-then-else branches checking all the possible cases in the whole UI, in every callback!
A better way is to set up a single controller for all the volume controls with two actions: "triggerMute" and "setVolume(level)". Then subscribe each control to the volume property of the sound model, so that the slider moves its handle according to the current volume level and the mute button looks "unpressed" when the level is > 0. The final step is a controller which simply sets the volume level or toggles mute. The only visible if-then-else branch here is in the triggerMute method, which should check the current volume level before changing it. However, that is application-specific behavior, which is outside the scope of the MVC pattern.
Generally speaking, your application is about “entities”. Given that:
1. Models are about the state of entities. They describe consistent state and referential integrity within the whole system. A model must guarantee that any transition results in a consistent state. A model defines validations and constraints, but doesn't define how transitions are performed (e.g. it doesn't check permissions). In the case of a database record, it should allow changing any values in any way, but ensure the consistency of the state just before the transaction is committed. A perfect model does not have "action" methods, only reader/writer accessors.
When you write tests for a model (you do write them, don't you?) you should not describe an "operational process" with the model, but rather check that any possible update yields a valid state or raises appropriate errors.
2. Controllers, on the other hand, are about actions. A controller does not bother with consistency of state (it may be stateless), but it specifies the control flow: it checks permissions, and creates, deletes and moves entities around.
In a test for a controller you describe the initial state of the system, perform an action and then check the resulting state.
3. Views are about representation of entities. A view does not modify the entity; it accesses its data, observes changes made to the model and generates events for the controller. When you test a view, you define a set of entities (the input) and check their representation (the output).
The simplest and most powerful MVC picture looks like this:
M -> V
V -> C
C -> M
The View observes the state (M) and updates itself when the state changes.
The Controller observes the View and performs the appropriate actions when controls are triggered.
Bonus track. How to write perfect tests for a music player “volume feature”?
1. Model test verifies that volume level could not be below zero or over 9000.
2. The controller test checks several values for the "setVolume" method and two cases for the "triggerMute" method: level = 0 and level = 1. Before the test the volume level is set to some value, and after the action it is checked.
3. View test should check that the slider and the button generate events for the controller and observe events generated by the model.
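The volume feature can be sketched in plain Ruby, following the M -> V, V -> C, C -> M arrows (all names here are invented, and the "view" is reduced to a variable that observes the model):

```ruby
# Model: holds state and guarantees consistency (volume stays in 0..9000).
class SoundModel
  attr_reader :volume

  def initialize
    @volume = 100
    @observers = []
  end

  def subscribe(&block)
    @observers << block
  end

  def volume=(level)
    @volume = level.clamp(0, 9000)        # consistency lives in the model
    @observers.each { |o| o.call(@volume) }
  end
end

# Controller: performs actions; the remembered pre-mute level is part
# of the triggerMute action, not entity state.
class VolumeController
  def initialize(model)
    @model = model
  end

  def set_volume(level)
    @model.volume = level
  end

  def trigger_mute
    if @model.volume > 0                  # the one legitimate branch
      @saved = @model.volume
      @model.volume = 0
    else
      @model.volume = @saved || 100
    end
  end
end

# "View": a variable updated by observing the model (M -> V).
model = SoundModel.new
slider_position = nil
model.subscribe { |v| slider_position = v }

controller = VolumeController.new(model)  # V -> C -> M
controller.set_volume(9500)               # the model clamps it to 9000
controller.trigger_mute                   # slider_position drops to 0
```

Neither the slider nor the button ever touches the model directly, so UI consistency comes for free: every control sees the same observed volume.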
Goddamn, a developer MUST be independent. Switch your brain on, please. RTFM.
Software source code is not a static thing. It evolves over time. A piece of code repeated in several places doesn't always mean it should be DRYed right away. There is such a thing as coincidence. When you put some functionality into a reusable module, it should have a meaning in terms of the business logic, not just be "repeated lines of code".
What's the big problem with "premature drying"? When a compiler looks at the code, it just executes it. It doesn't care how deep inside classes and functions some expression is located; that is merely a matter of performance.
But when a developer looks at the code, he constructs abstractions and loads them into his brain. A developer speaks the language of abstractions, not expressions. When I see a method or module used in several places, I read that as a relationship between those parts of the system; I see something in common. That impression shapes further understanding, and I'd better not be wrong about it.
Rule of thumb: don't try to save less than 273 bytes of code if doing so creates more abstractions or rules to remember and may lead to confusion.
1. Controller is not aware of our business logic and that’s a good thing.
Project.new(params[:project])
2. Initializing objects using hashes is convenient. Also, DataMapper uses it internally to initialize associations.
Project.new(:any => :param, :goes => :here)
3. Some params are special and are not allowed to be manipulated by user.
Project.new(params[:project].reject { |k, v|
  k.to_s =~ /^(id|owner(_id)?)$/
})
4. Some params are not so special, but are accessible by specific user groups.
@project.update_attributes(params[:project].reject { |k, v|
  k.to_s =~ /^owner(_id)?$/ &&
    @project.owner != current_person
})
5. While models maintain consistent state of the system (key uniqueness, correct data formats and relationships), controllers maintain control flow (hence the name) along with authentication and authorization.
6. Therefore, the currently used mass-assignment protection implementations do not solve the problem. The attr_accessible/attr_protected methods in ActiveRecord get in your way. A plugin for DataMapper I wrote yesterday doesn't help either.
7. The right solution is a mix-in module for controllers, with before-filters that safely instantiate all the necessary models.
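A sketch of what such a mix-in could look like (framework-agnostic; every name here is made up, and the protected-key regexp follows the examples above):

```ruby
# Controller mix-in: strip protected params before they ever reach
# the model, so the model stays ignorant of permissions.
module SafeAssignment
  PROTECTED = /\A(id|owner(_id)?)\z/   # keys a user may never set

  # Filter a params hash, dropping protected keys plus any extras
  # the calling controller marks off-limits for the current user.
  def safe_params(params, extra_protected: [])
    params.reject do |k, _v|
      k.to_s =~ PROTECTED || extra_protected.include?(k.to_s)
    end
  end
end

class ProjectsController
  include SafeAssignment

  # In a real framework this filtering would sit in a before-filter.
  def create(params)
    safe_params(params)   # a hash now safe to hand to Project.new
  end
end

ProjectsController.new.create("name" => "App", "owner_id" => 1)
# => {"name"=>"App"}
```

The controller decides what is assignable per action and per user, which is exactly the control-flow responsibility described in point 5.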
0. When the Ruby parser encounters an assignment statement on a term that looks like a local variable (a = 1, b ||= 2, etc.), it creates that variable. That's why it knows later that "a" is a local variable, not a method call.
1. Sometimes this leads to confusion:
def x; :x; end

def y
  x = 42 if false # assignment is not executed
  x
end

y #=> nil (not :x, as someone might predict)
# partial will have a local variable named "some_option"
partial(template, object, :some_option => 1)

# in this case there won't be a local variable "some_option"
partial(template, object)

Inside the partial you cannot simply write "if some_option", because it will raise a NameError if :some_option was not passed into the partial. However, "if (some_option ||= nil)" works in either case.
var Y = function(g) {
  return function(f) { return f(f); }(function(f) {
    return g(function() { return f(f).apply(null, arguments); });
  });
};
var factorial = Y(function(h) {
  return function(n) { return n < 2 ? 1 : n * h(n - 1); };
});

$ скл Олег Андреев
Олег Андреев
Олега Андреева
Олегу Андрееву
Олега Андреева
Олегом Андреевым
Олеге Андрееве
$ gem sources -a http://gems.github.com
$ sudo gem install yaroslav-yandex_inflect

# ~/.bashrc
alias inf="ruby -rubygems -e'require \"yandex_inflect\"; puts YandexInflect.inflections(ARGV.join(\" \"))'"
alias скл=inf
alias склон=inf
alias склонять=inf
alias склонение=inf
$ sudo gem update --system
Password:
Updating RubyGems
Updating rubygems-update
Successfully installed rubygems-update-1.3.1
ERROR: While executing gem ... (NameError)
undefined local variable or method `remote_gemspecs' for
#<Gem::Commands::UpdateCommand:0x11dfa8c>
There are always holy wars around Perl, Python, PHP, Ruby, Java and C#. We have enormous freedom on the server side compared to the quirky browser environment. Why don't we use Javascript on the server?
2-3 years ago, server-side JS wouldn't have given us anything that wasn't already available in other environments. But today we build rich applications with tons of JavaScript code. Don't we have some of that code to share with the server?
jQuery, prototype.js and MooTools could be used as template engines, much more flexible than <%%> and <??> tags, yet nicer and more lightweight than XSLT.
Selectors and transformations could be used both on the server side (to produce statically generated pages) and client side (for partial ajax updates).
Raw data interchange becomes easier. Also, you have common testing toolkit.
Recent JS implementations outperform Ruby, Python and Perl. Mostly due to JS's simplicity (no method_missing(), no classes and metaclasses, etc.), but also because of the enormous engineering experience invested, compared with legacy server-side technologies. While Ruby, Python and Perl are each used by X developers, JS is used by (3*X + Y) developers plus hundreds of millions of mere mortals all around the world.
I believe there'll be a web server written for some open-source JS engine very soon (Apache and Java don't count).
Must-read for every programmer interested in Io optimization.